Herbicide target genes and methods

ABSTRACT

The invention relates to genes isolated from Arabidopsis that code for proteins essential for normal plant development. The invention also includes the methods of using these proteins to discover new herbicides, based on the essentiality of the genes for normal growth and development. The invention can also be used in a screening assay to identify inhibitors that are potential herbicides. The invention is also applied to the development of herbicide tolerant plants, plant tissues, plant seeds, and plant cells.

[0001] This application claims the benefit of U.S. Provisional Application No. 60/214,819, filed Jun. 28, 2000, incorporated herein by reference in its entirety.

FIELD OF THE INVENTION

[0002] The invention relates to genes isolated from Arabidopsis thaliana that encode proteins essential for plant growth and development. The invention also includes the methods of using these proteins as herbicide targets, based on the essentiality of these genes for normal growth and development. The invention is also useful as a screening assay to identify inhibitors that are potential herbicides. The invention may also be applied to the development of herbicide tolerant plants, plant tissues, plant seeds, and plant cells.

BACKGROUND OF THE INVENTION

[0003] The use of herbicides to control undesirable vegetation such as weeds in crop fields has become almost a universal practice. The herbicide market exceeds 15 billion dollars annually. Despite this extensive use, weed control remains a significant and costly problem for farmers.

[0004] Effective use of herbicides requires sound management. For instance, the time and method of application and stage of weed plant development are critical to getting good weed control with herbicides. Since various weed species are resistant to herbicides, the production of effective new herbicides becomes increasingly important. Novel herbicides can now be discovered using high-throughput screens that implement recombinant DNA technology. Metabolic enzymes found to be essential to plant growth and development can be recombinantly produced through standard molecular biological techniques and utilized as herbicide targets in screens for novel inhibitors of the enzyme activity. The novel inhibitors discovered through such screens may then be used as herbicides to control undesirable vegetation.

[0005] Herbicides that exhibit greater potency, broader weed spectrum, and more rapid degradation in soil can also, unfortunately, have greater crop phytotoxicity. One solution applied to this problem has been to develop crops that are resistant or tolerant to herbicides. Crop hybrids or varieties tolerant to the herbicides allow for the use of the herbicides to kill weeds without attendant risk of damage to the crop. Development of tolerance can allow application of a herbicide to a crop where its use was previously precluded or limited (e.g. to pre-emergence use) due to sensitivity of the crop to the herbicide. For example, U.S. Pat. No. 4,761,373 to Anderson et al. is directed to plants resistant to various imidazolinone or sulfonamide herbicides. This resistance is conferred by an altered acetohydroxyacid synthase (AHAS) enzyme. U.S. Pat. No. 4,975,374 to Goodman et al. relates to plant cells and plants containing a gene encoding a mutant glutamine synthetase (GS) resistant to inhibition by herbicides that were known to inhibit GS, e.g. phosphinothricin and methionine sulfoximine. U.S. Pat. No. 5,013,659 to Bedbrook et al. is directed to plants expressing a mutant acetolactate synthase that renders the plants resistant to inhibition by sulfonylurea herbicides. U.S. Pat. No. 5,162,602 to Somers et al. discloses plants tolerant to inhibition by cyclohexanedione and aryloxyphenoxypropanoic acid herbicides. The tolerance is conferred by an altered acetyl coenzyme A carboxylase (ACCase).

[0006] Notwithstanding the above-described advancements, there remains a persistent and ongoing problem with unwanted or detrimental vegetation growth (e.g. weeds). Furthermore, as the population continues to grow, there will be increasing food shortages. Therefore, there exists a long-felt, yet unfulfilled, need to find new, effective, and economic herbicides.

SUMMARY OF THE INVENTION

[0007] It is an object of the invention to provide an effective and beneficial method to identify novel herbicides. A feature of the invention is the identification of a gene in A. thaliana, herein referred to as the 1917 gene, which shows sequence similarity to arginyl tRNA synthetase (Girjes et al. (1995) Gene, 164: 347-350; GenBank accession # Z98760 for this Arabidopsis gene). A feature of the invention is the identification of a gene in A. thaliana, herein referred to as the 2092 gene, which shows sequence similarity to alanyl tRNA synthetase (Mireau et al. (1996) The Plant Cell 8: 1027-1039). A feature of the invention is the identification of a gene in A. thaliana, herein referred to as the 7724 gene, which shows sequence similarity to 2′ tRNA phosphotransferase (Culver et al. (1997) J Biol Chemistry, 272:13203-13210; Spinelli et al. (1999) J. Biol. Chemistry, 274:2637-2644; Spinelli et al. (1997) RNA, 3:1388-1400). Another feature of the invention is the discovery that the 1917, 2092, and 7724 genes are essential for normal growth and development. An advantage of the present invention is that the newly discovered essential genes provide the basis for identity of a novel herbicidal mode of action which enables one skilled in the art to easily and rapidly discover novel inhibitors of gene function useful as herbicides.

[0008] One object of the present invention is to provide essential genes in plants for assay development for discovery of inhibitory compounds with herbicidal activity. Genetic results show that when any one of the 1917, 2092, or 7724 genes is mutated in Arabidopsis thaliana, the resulting phenotype is lethal in the homozygous state. This suggests a critical role for the gene products encoded by the 1917, 2092, and 7724 genes.

[0009] Using T-DNA insertion mutagenesis, the inventors of the present invention have demonstrated that the activity of each of the 1917, 2092, and 7724 gene products is essential for A. thaliana growth. This implies that chemicals that inhibit the function of the 1917-, 2092-, or 7724-encoded proteins in plants are likely to have detrimental effects on plants and are potentially good herbicide candidates. The present invention therefore provides methods of using a purified protein encoded by any of the 1917, 2092, or 7724 gene sequences described below to identify inhibitors thereof, which can then be used as herbicides to suppress the growth of undesirable vegetation, e.g. in fields where crops are grown, particularly agronomically important crops such as maize and other cereal crops such as wheat, oats, rye, sorghum, rice, barley, millet, turf and forage grasses, and the like, as well as cotton, sugar cane, sugar beet, oilseed rape, and soybeans.

[0010] The present invention discloses novel nucleotide sequences derived from A. thaliana, designated the 1917, 2092, or 7724 genes. The nucleotide sequences of the coding regions for the cDNA clones are set forth in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5, respectively, and the corresponding amino acid sequences of the 1917-, 2092-, and 7724-encoded proteins are set forth in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, respectively. The present invention also includes nucleotide sequences substantially similar to those set forth in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5, respectively. The present invention also encompasses plant proteins whose amino acid sequences are substantially similar to the amino acid sequences set forth in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, respectively. The present invention also includes methods of using the 1917, 2092, or 7724 gene products as herbicide targets, based on the essentiality of these genes for normal growth and development. Furthermore, the invention can be used in a screening assay to identify inhibitors of 1917, 2092, or 7724 gene function that are potential herbicides.

[0011] In a preferred embodiment, the present invention relates to a method for identifying chemicals having the ability to inhibit 1917, 2092, or 7724 activity in plants preferably comprising the steps of: a) obtaining transgenic plants, plant tissue, plant seeds or plant cells, preferably stably transformed, comprising a non-native nucleotide sequence encoding an enzyme having 1917, 2092, or 7724 activity and capable of overexpressing an enzymatically active 1917, 2092, or 7724 gene product (either full length or truncated but still active); b) applying a chemical to the transgenic plants, plant cells, tissues or parts and to the isogenic non-transformed plants, plant cells, tissues or parts; c) determining the growth or viability of the transgenic and non-transformed plants, plant cells, or tissues after application of the chemical; d) comparing the growth or viability of the transgenic and non-transformed plants, plant cells, or tissues after application of the chemical; and e) selecting chemicals that suppress the viability or growth of the non-transgenic plants, plant cells, tissues or parts, without significantly suppressing the viability or growth of the isogenic transgenic plants, plant cells, tissues or parts. In a preferred embodiment, the enzyme having 1917, 2092, or 7724 activity is encoded by a nucleotide sequence derived from a plant, preferably Arabidopsis thaliana, desirably identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5, respectively. In another embodiment, the enzyme having 1917, 2092, or 7724 activity is encoded by a nucleotide sequence capable of encoding the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, respectively. In yet another embodiment, the enzyme having 1917, 2092, or 7724 activity has an amino acid sequence identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, respectively.
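The comparison and selection of steps (d) and (e) can be illustrated with a minimal Python sketch. It assumes that growth has already been measured and normalized to an untreated control (1.0 meaning no effect). The names GrowthResult and select_candidates and the two cutoff values are hypothetical and chosen only for illustration; the specification does not fix numerical cutoffs for this step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GrowthResult:
    """Growth after treatment, relative to an untreated control (1.0 = no effect)."""
    chemical: str
    non_transformed_growth: float  # isogenic non-transformed line
    transgenic_growth: float       # line overexpressing the target enzyme


def select_candidates(results: List[GrowthResult],
                      suppress_below: float = 0.5,
                      tolerate_above: float = 0.8) -> List[str]:
    """Return chemicals that suppress the non-transformed line but not the
    overexpressing line. Both cutoffs are illustrative assumptions."""
    selected = []
    for r in results:
        suppresses_wild_type = r.non_transformed_growth < suppress_below
        spares_transgenic = r.transgenic_growth >= tolerate_above
        if suppresses_wild_type and spares_transgenic:
            selected.append(r.chemical)
    return selected


if __name__ == "__main__":
    screen = [
        GrowthResult("cmpd_A", 0.10, 0.90),  # rescued by overexpression
        GrowthResult("cmpd_B", 0.15, 0.20),  # toxic to both lines
        GrowthResult("cmpd_C", 0.95, 0.97),  # inactive
    ]
    print(select_candidates(screen))
```

A chemical that suppresses both lines is excluded because overexpression of the target does not rescue growth, which suggests a different mode of action.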

[0012] The present invention further embodies plants, plant tissues, plant seeds, and plant cells that have modified 1917, 2092, or 7724 activity and that are therefore tolerant to inhibition by a herbicide at levels normally inhibitory to naturally occurring 1917, 2092, or 7724-encoded activity. Herbicide tolerant plants encompassed by the invention include those that would otherwise be potential targets for 1917, 2092, or 7724-inhibiting herbicides, particularly the agronomically important crops mentioned above. According to this embodiment, plants, plant tissue, plant seeds, or plant cells are transformed, preferably stably transformed, with a recombinant DNA molecule comprising a suitable promoter functional in plants operatively linked to a nucleotide sequence that encodes a modified 1917, 2092, or 7724 gene that is tolerant to inhibition by a herbicide at a concentration that would normally inhibit the activity of wild-type, unmodified 1917, 2092, or 7724 gene product. Modified 1917, 2092, or 7724 activity may also be conferred upon a plant by increasing expression of wild-type herbicide-sensitive 1917, 2092, or 7724 protein by providing multiple copies of wild-type 1917, 2092, or 7724 genes to the plant or by overexpression of wild-type 1917, 2092, or 7724 genes under control of a stronger-than-wild-type promoter. The transgenic plants, plant tissue, plant seeds, or plant cells thus created are then selected using conventional techniques, whereby herbicide tolerant lines are isolated, characterized, and developed. Alternatively, random or site-specific mutagenesis may be used to generate herbicide tolerant lines.

[0013] Therefore, the present invention provides a plant, plant cell, plant seed, or plant tissue transformed with a DNA molecule comprising a nucleotide sequence isolated from a plant that encodes an enzyme having 1917, 2092, or 7724 activity, wherein the DNA expresses the 1917, 2092, or 7724 activity and wherein the DNA molecule confers upon the plant, plant cell, plant seed, or plant tissue tolerance to a herbicide in amounts that normally inhibit naturally occurring 1917, 2092, or 7724 activity. According to one example of this embodiment, the enzyme having 1917, 2092, or 7724 activity is encoded by a nucleotide sequence identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5, respectively, or has an amino acid sequence identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, respectively.

[0014] The invention also provides a method for suppressing the growth of a plant comprising the step of applying to the plant a chemical that inhibits the naturally occurring 1917, 2092, or 7724 activity in the plant. In a related aspect, the present invention is directed to a method for selectively suppressing the growth of undesired vegetation in a field containing a crop of planted crop seeds or plants, comprising the steps of: (a) optionally planting herbicide tolerant crops or crop seeds, which are plants or plant seeds that are tolerant to a herbicide that inhibits the naturally occurring 1917, 2092, or 7724 activity; and (b) applying to the herbicide tolerant crops or crop seeds and the undesired vegetation in the field a herbicide in amounts that inhibit naturally occurring 1917, 2092, or 7724 activity, wherein the herbicide suppresses the growth of the weeds without significantly suppressing the growth of the crops.

[0015] The invention thus provides an isolated DNA molecule comprising a nucleotide sequence substantially similar to SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively. In a preferred embodiment, the nucleotide sequence encodes an amino acid sequence substantially similar to SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6, respectively. In another preferred embodiment, the nucleotide sequence is SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively. In yet another preferred embodiment, the nucleotide sequence encodes the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6, respectively. Preferably, the nucleotide sequence is a plant nucleotide sequence, which preferably encodes a polypeptide having 1917, 2092, or 7724 activity, respectively.

[0016] The invention further provides a polypeptide comprising an amino acid sequence encoded by a nucleotide sequence substantially similar to SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively. Preferably, the amino acid sequence is encoded by SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively. Preferably, the polypeptide comprises an amino acid sequence substantially similar to SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6, respectively. Preferably, the amino acid sequence is SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6, respectively. The amino acid sequence preferably has 1917, 2092, or 7724 activity, respectively. In another preferred embodiment, the amino acid sequence comprises at least 20 consecutive amino acid residues of the amino acid sequence encoded by SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively. Alternatively, the amino acid sequence comprises at least 20 consecutive amino acid residues of the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6, respectively.

[0017] The invention further provides an expression cassette comprising a promoter operatively linked to a DNA molecule according to the present invention, a recombinant vector comprising an expression cassette according to the present invention, wherein said vector is preferably capable of being stably transformed into a host cell, and a host cell comprising a DNA molecule according to the present invention, wherein said DNA molecule is preferably expressible in the cell. The host cell is preferably selected from the group consisting of an insect cell, a yeast cell, a prokaryotic cell and a plant cell. The invention further provides a plant or seed comprising a plant cell of the present invention, wherein the plant or seed is preferably tolerant to an inhibitor of 1917, 2092, or 7724 activity, respectively.

[0018] The invention further provides a process for making nucleotide sequences encoding gene products having altered 1917, 2092, or 7724 activity, respectively, comprising: a) shuffling an unmodified nucleotide sequence of the present invention, b) expressing the resulting shuffled nucleotide sequences, and c) selecting for altered 1917, 2092, or 7724 activity, respectively, as compared to the 1917, 2092, or 7724 activity, respectively, of the gene product of said unmodified nucleotide sequence.

[0019] In a preferred embodiment, the unmodified nucleotide sequence is identical or substantially similar to SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively, or a homolog thereof. The present invention further provides a DNA molecule comprising a shuffled nucleotide sequence obtainable by the process described above, and a DNA molecule comprising a shuffled nucleotide sequence produced by the process described above. Preferably, a shuffled nucleotide sequence obtained by the process described above has enhanced tolerance to an inhibitor of 1917, 2092, or 7724 activity, respectively. The invention further provides an expression cassette comprising a promoter operatively linked to a DNA molecule comprising a shuffled nucleotide sequence, a recombinant vector comprising such an expression cassette, wherein said vector is preferably capable of being stably transformed into a host cell, and a host cell comprising such an expression cassette, wherein said nucleotide sequence is preferably expressible in said cell. A preferred host cell is selected from the group consisting of an insect cell, a yeast cell, a prokaryotic cell and a plant cell. The invention further provides a plant or seed comprising such plant cell, wherein the plant is preferably tolerant to an inhibitor of 1917, 2092, or 7724 activity, respectively.

[0020] The invention further provides a method for selecting compounds that interact with the protein encoded by SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively, comprising: a) expressing a DNA molecule comprising SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively, or a sequence substantially similar to SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively, or a homolog thereof, to generate the corresponding protein, b) testing a compound suspected of having the ability to interact with the protein expressed in step (a), and c) selecting compounds that interact with the protein in step (b).

[0021] The invention further provides a process of identifying an inhibitor of 1917, 2092, or 7724 activity, respectively, comprising: a) introducing a DNA molecule comprising a nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively, and having 1917, 2092, or 7724 activity, respectively, or nucleotide sequences substantially similar thereto, or a homolog thereof, into a plant cell, such that said sequence is functionally expressible at levels that are higher than wild-type expression levels, b) combining said plant cell with a compound to be tested for the ability to inhibit the 1917, 2092, or 7724 activity, respectively, under conditions conducive to such inhibition, c) measuring plant cell growth under the conditions of step (b), d) comparing the growth of said plant cell with the growth of a plant cell having unaltered 1917, 2092, or 7724 activity, respectively, under identical conditions, and e) selecting said compound that inhibits plant cell growth in step (d).

[0022] The invention further comprises a compound having herbicidal activity identifiable according to the process described immediately above.

[0023] The invention further comprises:

[0024] A process of identifying compounds having herbicidal activity comprising: a) combining a protein of the present invention and a compound to be tested for the ability to interact with said protein, under conditions conducive to interaction, b) selecting a compound identified in step (a) that is capable of interacting with said protein, c) applying the compound identified in step (b) to a plant to test for herbicidal activity, and d) selecting compounds having herbicidal activity.

[0025] The invention further comprises a compound having herbicidal activity identifiable according to the process described immediately above.

[0026] The invention further comprises:

[0027] A method for suppressing the growth of a plant comprising applying to said plant a compound that inhibits the activity of a polypeptide of the present invention in an amount sufficient to suppress the growth of said plant.

[0028] The invention further comprises:

[0029] A method for recombinantly expressing a protein having 1917, 2092, or 7724 activity comprising introducing a nucleotide sequence encoding a protein having one of the above activities into a host cell and expressing the nucleotide sequence in the host cell. A preferred host cell is selected from the group consisting of an insect cell, a yeast cell, a prokaryotic cell and a plant cell. A preferred prokaryotic cell is a bacterial cell, e.g. E. coli.

[0030] Other objects and advantages of the present invention will become apparent to those skilled in the art from a study of the following description of the invention and non-limiting examples.

Definitions

[0031] For clarity, certain terms used in the specification are defined and presented as follows:

[0032] Cofactor: natural reactant, such as an organic molecule or a metal ion, required in an enzyme-catalyzed reaction. A co-factor is e.g. NAD(P), riboflavin (including FAD and FMN), folate, molybdopterin, thiamin, biotin, lipoic acid, pantothenic acid and coenzyme A, S-adenosylmethionine, pyridoxal phosphate, ubiquinone, menaquinone. Optionally, a co-factor can be regenerated and reused.

[0033] DNA shuffling: DNA shuffling is a method to rapidly, easily and efficiently introduce mutations or rearrangements, preferably randomly, in a DNA molecule or to generate exchanges of DNA sequences between two or more DNA molecules, preferably randomly. The DNA molecule resulting from DNA shuffling is a shuffled DNA molecule that is a non-naturally occurring DNA molecule derived from at least one template DNA molecule. The shuffled DNA encodes an enzyme modified with respect to the enzyme encoded by the template DNA, and preferably has an altered biological activity with respect to the enzyme encoded by the template DNA.

[0034] Enzyme activity: means herein the ability of an enzyme to catalyze the conversion of a substrate into a product. A substrate for the enzyme comprises the natural substrate of the enzyme but also comprises analogues of the natural substrate which can also be converted by the enzyme into a product or into an analogue of a product. The activity of the enzyme is measured for example by determining the amount of product in the reaction after a certain period of time, or by determining the amount of substrate remaining in the reaction mixture after a certain period of time. The activity of the enzyme is also measured by determining the amount of an unused co-factor of the reaction remaining in the reaction mixture after a certain period of time or by determining the amount of used co-factor in the reaction mixture after a certain period of time. The activity of the enzyme is also measured by determining the amount of a donor of free energy or energy-rich molecule (e.g. ATP, phosphoenolpyruvate, acetyl phosphate or phosphocreatine) remaining in the reaction mixture after a certain period of time or by determining the amount of a used donor of free energy or energy-rich molecule (e.g. ADP, pyruvate, acetate or creatine) in the reaction mixture after a certain period of time.
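As an illustration of the first two measurement approaches in this definition, the following minimal Python sketch computes a specific activity either from the product formed or from the substrate remaining after a fixed time. The function names, the units (nmol, minutes, mg of enzyme) and the assumption of a 1:1 substrate-to-product stoichiometry are illustrative choices and are not taken from the specification.

```python
def activity_from_product(product_nmol: float, minutes: float, enzyme_mg: float) -> float:
    """Specific activity (nmol product per minute per mg enzyme) from product formed."""
    if minutes <= 0 or enzyme_mg <= 0:
        raise ValueError("minutes and enzyme_mg must be positive")
    return product_nmol / (minutes * enzyme_mg)


def activity_from_substrate(initial_nmol: float, remaining_nmol: float,
                            minutes: float, enzyme_mg: float) -> float:
    """Specific activity from substrate consumed, assuming 1:1 conversion to product."""
    if remaining_nmol > initial_nmol:
        raise ValueError("remaining substrate cannot exceed the initial amount")
    consumed = initial_nmol - remaining_nmol
    return activity_from_product(consumed, minutes, enzyme_mg)


if __name__ == "__main__":
    # 300 nmol product after 10 min with 0.5 mg enzyme
    print(activity_from_product(300.0, 10.0, 0.5))
    # 1000 nmol substrate at the start, 700 nmol left after 10 min with 0.5 mg enzyme
    print(activity_from_substrate(1000.0, 700.0, 10.0, 0.5))
```

Under the 1:1 assumption both routes give the same value for the same reaction; the co-factor and energy-donor measurements described above can be handled in the same way by substituting the corresponding amounts.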

[0035] Herbicide: a chemical substance used to kill or suppress the growth of plants, plant cells, plant seeds, or plant tissues.

[0036] Heterologous DNA Sequence: a DNA sequence not naturally associated with a host cell into which it is introduced, including non-naturally occurring multiple copies of a naturally occurring DNA sequence; and genetic constructs wherein an otherwise homologous DNA sequence is operatively linked to a non-native sequence.

[0037] Homologous DNA Sequence: a DNA sequence naturally associated with a host cell into which it is introduced.

[0038] Inhibitor: a chemical substance that causes abnormal growth, e.g., by inactivating the enzymatic activity of a protein such as a biosynthetic enzyme, receptor, signal transduction protein, structural gene product, or transport protein that is essential to the growth or survival of the plant. In the context of the instant invention, an inhibitor is a chemical substance that alters the enzymatic activity encoded by a nucleotide sequence of the present invention. More generally, an inhibitor causes abnormal growth of a host cell by interacting with the gene product encoded by the nucleotide sequence of the present invention.

[0039] Isogenic: plants which are genetically identical, except that they may differ by the presence or absence of a heterologous DNA sequence.

[0040] Isolated: in the context of the present invention, an isolated DNA molecule or an isolated enzyme is a DNA molecule or enzyme that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated DNA molecule or enzyme may exist in a purified form or may exist in a non-native environment such as, for example, in a transgenic host cell.

[0041] Mature protein: protein which is normally targeted to a cellular organelle, such as a chloroplast, and from which the transit peptide has been removed.

[0042] Minimal Promoter: promoter elements, particularly a TATA element, that are inactive or that have greatly reduced promoter activity in the absence of upstream activation. In the presence of a suitable transcription factor, the minimal promoter functions to permit transcription.

[0043] Modified Enzyme Activity: enzyme activity different from that which naturally occurs in a plant (i.e. enzyme activity that occurs naturally in the absence of direct or indirect manipulation of such activity by man), which is tolerant to inhibitors that inhibit the naturally occurring enzyme activity.

[0044] Pre-protein: protein which is normally targeted to a cellular organelle, such as a chloroplast, and still comprising its transit peptide.

[0045] Significant Increase: an increase in enzymatic activity that is larger than the margin of error inherent in the measurement technique, preferably an increase by about 2-fold or greater of the activity of the wild-type enzyme in the presence of the inhibitor, more preferably an increase by about 5-fold or greater, and most preferably an increase by about 10-fold or greater.

[0046] Significantly less: means that the amount of a product of an enzymatic reaction is reduced by more than the margin of error inherent in the measurement technique, preferably a decrease by about 2-fold or greater of the activity of the wild-type enzyme in the absence of the inhibitor, more preferably a decrease by about 5-fold or greater, and most preferably a decrease by about 10-fold or greater.
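The fold-change tiers used in the definitions of "Significant Increase" and "Significantly less" can be sketched in Python as follows. The function names and the error_margin parameter are hypothetical; the definitions say "about" 2-, 5- and 10-fold, so the hard cutoffs below are a simplification, and the margin of error of a real assay would have to be supplied by the experimenter.

```python
def fold_increase(modified_activity: float, wild_type_activity: float) -> float:
    """Fold increase of a modified enzyme over wild type, both measured with inhibitor."""
    if wild_type_activity <= 0:
        raise ValueError("wild_type_activity must be positive")
    return modified_activity / wild_type_activity


def fold_decrease(uninhibited_activity: float, inhibited_activity: float) -> float:
    """Fold decrease of wild-type activity caused by an inhibitor."""
    if inhibited_activity <= 0:
        raise ValueError("inhibited_activity must be positive")
    return uninhibited_activity / inhibited_activity


def tier(fold: float, error_margin: float = 0.0) -> str:
    """Map a fold change onto the preference tiers of the definitions.

    error_margin is the fractional measurement error (an assumed input);
    a change no larger than that margin is treated as not significant.
    """
    if fold <= 1.0 + error_margin:
        return "not significant"
    if fold >= 10.0:
        return "most preferably"
    if fold >= 5.0:
        return "more preferably"
    if fold >= 2.0:
        return "preferably"
    return "significant"


if __name__ == "__main__":
    print(tier(fold_increase(40.0, 4.0)))
    print(tier(fold_decrease(9.0, 3.0)))
```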

[0047] Substantially similar: with respect to a gene of the present invention, in its broadest sense, the term “substantially similar”, when used herein with respect to a nucleotide sequence, means a nucleotide sequence corresponding to a reference nucleotide sequence, wherein the corresponding sequence encodes a polypeptide having substantially the same structure and function as the polypeptide encoded by the reference nucleotide sequence, e.g. where only changes in amino acids not affecting the polypeptide function occur. Desirably the substantially similar nucleotide sequence encodes the polypeptide encoded by the reference nucleotide sequence. The term “substantially similar” is specifically intended to include nucleotide sequences wherein the sequence has been modified to optimize expression in particular cells. A nucleotide sequence “substantially similar” to a reference nucleotide sequence has a complement that hybridizes to the reference nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 2×SSC, 0.1% SDS at 50° C., more desirably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 1×SSC, 0.1% SDS at 50° C., more desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.5×SSC, 0.1% SDS at 50° C., preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 50° C., more preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 65° C. As used herein the term “1917 gene” refers to a DNA molecule comprising SEQ ID NO:1 or comprising a nucleotide sequence substantially similar to SEQ ID NO:1. As used herein the term “2092 gene” refers to a DNA molecule comprising SEQ ID NO:3 or comprising a nucleotide sequence substantially similar to SEQ ID NO:3. As used herein the term “7724 gene” refers to a DNA molecule comprising SEQ ID NO:5 or comprising a nucleotide sequence substantially similar to SEQ ID NO:5.

[0048] With respect to a protein of the present invention, the term “substantially similar”, when used herein with respect to a protein, means a protein corresponding to a reference protein, wherein the protein has substantially the same structure and function as the reference protein, e.g. where only changes in amino acid sequence not affecting the polypeptide function occur.

[0049] One skilled in the art is also familiar with analysis tools, such as GAP analysis, to determine the percentage of identity between the “substantially similar” and the reference nucleotide sequence, or protein or amino acid sequence. In the present invention, “substantially similar” is therefore also determined using default GAP analysis parameters with the University of Wisconsin GCG, SEQWEB application of GAP, based on the algorithm of Needleman and Wunsch (Needleman and Wunsch (1970) J Mol. Biol. 48: 443-453).
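For orientation, the following is a minimal Python sketch of Needleman-Wunsch global alignment and a percent identity derived from it. It is not the GCG GAP program and does not reproduce its default parameters: the scoring values (match +1, mismatch -1, linear gap penalty -2), the function names, and the choice to report identity as identical columns divided by total alignment columns are all simplifying assumptions, so the percentages it returns will generally differ from those of GAP.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> Tuple[str, str]:
    """Global alignment of a and b with a linear gap penalty; '-' marks a gap."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner, preferring the diagonal move.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned columns as a percentage of all alignment columns."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
    print(percent_identity("MKTAY", "MKTAV"))
```

Production tools use substitution matrices and separate gap-opening and gap-extension penalties, which is one reason the percentages quoted in this specification should be computed with the named program rather than with a sketch such as this one.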

[0050] Thus, in the context of the “1917 gene” and using GAP analysis as described above, “substantially similar” refers to nucleotide sequences that encode a protein having at least 48% identity, more preferably at least 50% identity, still more preferably at least 65% identity, still more preferably at least 75% identity, still more preferably at least 85% identity, still more preferably at least 95% identity, yet still more preferably at least 99% identity to SEQ ID NO:2. Further, using GAP analysis as described above, “homologs of the 1917 gene” include nucleotide sequences that encode an amino acid sequence that has at least 30% identity to SEQ ID NO:2, more preferably at least 40% identity, still more preferably at least 45% identity, still more preferably at least 55% identity, yet still more preferably at least 65% identity, still more preferably at least 75% identity, yet still more preferably at least 85% identity to SEQ ID NO:2, wherein the amino acid sequence encoded by the homolog has the biological activity of the 1917 protein.

[0051] When using GAP analysis as described above with respect to a protein or an amino acid sequence and in the context of the “1917 gene”, the percentage of identity between the “substantially similar” protein or amino acid sequence and the reference protein or amino acid sequence (in this case SEQ ID NO:2) is at least 48%, more preferably at least 50%, still more preferably at least 65%, still more preferably at least 75%, still more preferably at least 85%, still more preferably at least 95%, yet still more preferably at least 99%.

[0052] “Homologs of the 1917 protein” include amino acid sequences that are at least 30% identical to SEQ ID NO:2, more preferably at least 40% identical, still more preferably at least 45% identical, still more preferably at least 55% identical, yet still more preferably at least 65% identical, still more preferably at least 75% identical, yet still more preferably at least 85% identical to SEQ ID NO:2, wherein homologs of the 1917 protein have the biological activity of the 1917 protein.

[0053] Thus, in the context of the “2092 gene” and using GAP analysis as described above, “substantially similar” refers to nucleotide sequences that encode a protein having at least 58% identity, more preferably at least 65% identity, still more preferably at least 75% identity, still more preferably at least 85% identity, still more preferably at least 95% identity, yet still more preferably at least 99% identity to SEQ ID NO:4. Further, using GAP analysis as described above, “homologs of the 2092 gene” include nucleotide sequences that encode an amino acid sequence that has at least 34% identity to SEQ ID NO:4, more preferably at least 40% identity, still more preferably at least 50% identity, still more preferably at least 60% identity, yet still more preferably at least 65% identity, still more preferably at least 75% identity, yet still more preferably at least 85% identity to SEQ ID NO:4, wherein the amino acid sequence encoded by the homolog has the biological activity of the 2092 protein.

[0054] When using GAP analysis as described above with respect to a protein or an amino acid sequence and in the context of the “2092 gene”, the percentage of identity between the “substantially similar” protein or amino acid sequence and the reference protein or amino acid sequence (in this case SEQ ID NO:4) is at least 58%, more preferably at least 65%, still more preferably at least 75%, still more preferably at least 85%, still more preferably at least 95%, yet still more preferably at least 99%.

[0055] “Homologs of the 2092 protein” include amino acid sequences that are at least 34% identical to SEQ ID NO:4, more preferably at least 50% identical, still more preferably at least 55% identical, still more preferably at least 60% identical, yet still more preferably at least 65% identical, still more preferably at least 75% identical, yet still more preferably at least 85% identical to SEQ ID NO:4, wherein homologs of the 2092 protein have the biological activity of the 2092 protein.

[0056] Thus, in the context of the “7724 gene” and using GAP analysis as described above, “substantially similar” refers to nucleotide sequences that encode a protein having at least 36% identity, more preferably at least 50% identity, more preferably at least 70% identity, more preferably at least 90% identity, still more preferably at least 99% identity to SEQ ID NO:6. Further, using GAP analysis as described above, “homologs of the 7724 gene” include nucleotide sequences that encode an amino acid sequence that has at least 30% identity to SEQ ID NO:6, more preferably at least 40% identity, still more preferably at least 50% identity, still more preferably at least 60% identity, yet still more preferably at least 70% identity, still more preferably at least 85% identity, yet still more preferably at least 90% identity to SEQ ID NO:6, wherein the amino acid sequence encoded by the homolog has the biological activity of the 7724 protein.

[0057] When using GAP analysis as described above with respect to a protein or an amino acid sequence and in the context of the “7724 gene”, the percentage of identity between the “substantially similar” protein or amino acid sequence and the reference protein or amino acid sequence (in this case SEQ ID NO:6) is at least 36%, more preferably at least 50%, more preferably at least 70%, more preferably at least 90%, still more preferably at least 99%.

[0058] “Homologs of the 7724 protein” include amino acid sequences that are at least 30% identical to SEQ ID NO:6, more preferably at least 40% identical, still more preferably at least 50% identical, still more preferably at least 60% identical, yet still more preferably at least 70% identical, still more preferably at least 85% identical, yet still more preferably at least 95% identical to SEQ ID NO:6, wherein homologs of the 7724 protein have the biological activity of the 7724 protein.
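The minimum identity thresholds recited above for the three genes can be applied mechanically once a pairwise alignment is available. The following Python sketch is illustrative only and is not part of the GAP program: the function names, the dictionary layout, and the way identity is counted (identical residues over aligned columns that contain no gap) are assumptions made for this example. The alignment itself is assumed to have been produced separately, for instance by GAP with its default settings. A sequence in the homolog range qualifies as a homolog only if it also has the biological activity of the reference protein, which this sketch does not test.

```python
# Minimum percent-identity thresholds taken from the definitions above.
# The dictionary layout and all names below are illustrative assumptions.
THRESHOLDS = {
    "1917": {"reference": "SEQ ID NO:2", "substantially_similar": 48.0, "homolog": 30.0},
    "2092": {"reference": "SEQ ID NO:4", "substantially_similar": 58.0, "homolog": 34.0},
    "7724": {"reference": "SEQ ID NO:6", "substantially_similar": 36.0, "homolog": 30.0},
}


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # columns containing a gap are not counted (assumption)
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def classify(gene, identity):
    """Place a percent identity into the broadest category it satisfies."""
    limits = THRESHOLDS[gene]
    if identity >= limits["substantially_similar"]:
        return "substantially similar"
    if identity >= limits["homolog"]:
        return "homolog range"
    return "below minimum thresholds"


if __name__ == "__main__":
    # Toy aligned fragments, invented for illustration.
    identity = percent_identity("ACDE-FG", "ACDQQFG")
    print(round(identity, 1), classify("1917", identity))
```

Only the broadest threshold for each category is encoded here; the successively preferred higher thresholds could be added to the same table if a finer ranking is wanted.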

[0059] Substrate: a substrate is the molecule that an enzyme naturally recognizes and converts to a product in the biochemical pathway in which the enzyme naturally carries out its function, or is a modified version of the molecule, which is also recognized by the enzyme and is converted by the enzyme to a product in an enzymatic reaction similar to the naturally-occurring reaction.

[0060] Tolerance: the ability to continue essentially normal growth or function when exposed to an inhibitor or herbicide in an amount sufficient to suppress the normal growth or function of native, unmodified plants.

[0061] Transformation: a process for introducing heterologous DNA into a cell, tissue, or plant. Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.

[0062] Transgenic: stably transformed with a recombinant DNA molecule that preferably comprises a suitable promoter operatively linked to a DNA sequence of interest.

BRIEF DESCRIPTION OF THE SEQUENCES IN THE SEQUENCE LISTING

[0063] SEQ ID NO:1 cDNA coding sequence for isoform II of the Arabidopsis thaliana 1917 gene

[0064] SEQ ID NO:2 amino acid sequence encoded by isoform II of the Arabidopsis thaliana 1917 DNA sequence shown in SEQ ID NO:1

[0065] SEQ ID NO:3 cDNA coding sequence for the Arabidopsis thaliana 2092 gene

[0066] SEQ ID NO:4 amino acid sequence encoded by the Arabidopsis thaliana 2092 cDNA sequence shown in SEQ ID NO:3

[0067] SEQ ID NO:5 cDNA coding sequence for the Arabidopsis thaliana 7724 gene

[0068] SEQ ID NO:6 amino acid sequence encoded by the Arabidopsis thaliana 7724 DNA sequence shown in SEQ ID NO:5

[0069] SEQ ID NO:7 complete cDNA coding sequence, including 5′ UTR, coding region, and 3′ UTR sequences, for the Arabidopsis thaliana 2092 gene

[0070] SEQ ID NO:8 amino acid sequence encoded by the Arabidopsis thaliana 2092 DNA sequence shown in SEQ ID NO:7

[0071] SEQ ID NO:9 oligonucleotide CA50

[0072] SEQ ID NO:10 oligonucleotide CA51

[0073] SEQ ID NO:11 oligonucleotide CA52

[0074] SEQ ID NO:12 oligonucleotide CA53

[0075] SEQ ID NO:13 oligonucleotide CA54

[0076] SEQ ID NO:14 oligonucleotide CA55

[0077] SEQ ID NO:15 oligonucleotide CA66

[0078] SEQ ID NO:16 oligonucleotide CA67

[0079] SEQ ID NO:17 oligonucleotide CA68

[0080] SEQ ID NO:18 oligonucleotide JM33

[0081] SEQ ID NO:19 oligonucleotide JM34

[0082] SEQ ID NO:20 oligonucleotide JM35

[0083] SEQ ID NO:21 complete cDNA coding sequence, including 5′ UTR, coding region, and 3′ UTR sequences, for the Arabidopsis thaliana 7724 gene

[0084] SEQ ID NO:22 amino acid sequence encoded by the Arabidopsis thaliana 7724 DNA sequence shown in SEQ ID NO:21

[0085] SEQ ID NO:23 genomic sequence of the Arabidopsis thaliana 7724 gene

[0086] SEQ ID NO:24 cDNA coding sequence for isoform I of the Arabidopsis thaliana 1917 gene

[0087] SEQ ID NO:25 amino acid sequence encoded by isoform I of the Arabidopsis thaliana 1917 DNA sequence shown in SEQ ID NO:24

[0088] SEQ ID NO:26 oligonucleotide slp346

[0089] SEQ ID NO:27 oligonucleotide JM99

[0090] SEQ ID NO:28 oligonucleotide JM100

DETAILED DESCRIPTION OF THE INVENTION

[0091] I.a. Essentiality of the 1917, 2092, and 7724 Genes in Arabidopsis thaliana Demonstrated by T-DNA Insertion Mutagenesis

[0092] As shown in the examples below, the identification of a novel gene structure, as well as the essentiality of the 1917, 2092, and 7724 genes for normal plant growth and development, have been demonstrated for the first time in Arabidopsis using T-DNA insertion mutagenesis. Having established the essentiality of 1917, 2092, and 7724 function in plants and having identified the genes encoding these essential activities, the inventors thereby provide an important and sought after tool for new herbicide development.

[0093] Essential genes are identified through the isolation of lethal mutants blocked in early development. Examples of lethal mutants include those blocked in the formation of the male or female gametes or embryo. Gametophytic mutants are found by examining T1 insertion lines for the presence of 50% aborted pollen grains or ovules. Embryo defective mutants produce 25% defective seeds following self-pollination of T1 plants (see Errampalli et al. 1991, Plant Cell 3:149-157; Castle et al. 1993, Mol Gen Genet 241:504-514).
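The expected proportions stated above (25% defective seeds for an embryo defective mutant, 50% aborted pollen grains or ovules for a gametophytic mutant) can be compared with observed counts using a chi-square goodness-of-fit test with one degree of freedom. The following Python sketch is offered only as an illustration: the function names and the seed counts are hypothetical, and the 5% significance level is an assumed convention, not a criterion stated in this specification.

```python
# Chi-square critical value for one degree of freedom at the 5% level.
CRITICAL_5_PERCENT_DF1 = 3.841


def chi_square_two_classes(observed_mutant, observed_normal, expected_mutant_fraction):
    """Goodness-of-fit statistic for two phenotype classes against an expected fraction."""
    total = observed_mutant + observed_normal
    expected_mutant = total * expected_mutant_fraction
    expected_normal = total - expected_mutant
    return ((observed_mutant - expected_mutant) ** 2 / expected_mutant
            + (observed_normal - expected_normal) ** 2 / expected_normal)


def consistent_with_ratio(observed_mutant, observed_normal, expected_mutant_fraction):
    """True when the counts do not deviate significantly from the expected fraction."""
    statistic = chi_square_two_classes(observed_mutant, observed_normal,
                                       expected_mutant_fraction)
    return statistic <= CRITICAL_5_PERCENT_DF1


if __name__ == "__main__":
    # Hypothetical counts from one silique sample: 52 defective and 148 normal seeds.
    print(consistent_with_ratio(52, 148, 0.25))  # embryo defective expectation
    # Hypothetical counts that fit neither expectation well.
    print(consistent_with_ratio(90, 110, 0.25))
```

A line whose counts are consistent with the 25% expectation would then be carried forward to the cosegregation analysis.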

[0094] When a line is identified as segregating for an embryo lethal mutation, it is determined if the resistance marker in the T-DNA cosegregates with the lethality (Errampalli et al. (1991) The Plant Cell, 3:149-157). Cosegregation analysis is done by placing the seeds on media containing the selective agent and scoring the seedlings for resistance or sensitivity to the agent. Examples of selective agents used are hygromycin or phosphinothricin. Resistant seedlings (17 for line 1917, 35 for line 2092, and 37 for line 7724) are transplanted to soil and their progeny are examined for the segregation of the embryo-lethal phenotype. In the case in which the T-DNA insertion disrupts an essential gene, there is cosegregation of the resistance phenotype and the embryo-lethal phenotype in every plant. Therefore, in such a case, all resistant plants segregate for the lethal phenotype in the next generation; this result indicates that each of the resistant plants is heterozygous for the mutation and hemizygous for the T-DNA insert causing the mutation. For those lines showing cosegregation of the T-DNA resistance marker and the lethal phenotype, PCR-based approaches, such as TAIL PCR (Liu and Whittier (1995), Genomics, 25: 674-681), vectorette PCR (Riley et al. (1990) Nucleic Acids Research, 18: 2887-2890), or a strategy such as the Genome Walker system (CLONTECH Laboratories, Inc, Palo Alto, Calif.), may be used to directly amplify plant DNA/T-DNA border fragments. Each of these techniques takes advantage of the fact that the DNA sequence of the insertion element is known, and can routinely be used to recover small (less than 5 kb) fragments adjacent to the known sequence. Alternatively, plasmid rescue may be used to isolate the plant DNA/T-DNA border fragments. Southern blot analysis may be performed as an initial step in the characterization of the molecular nature of each insertion. Southern blots are done with genomic DNA isolated from heterozygotes and using probes capable of hybridizing with the T-DNA vector DNA.
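The strength of the cosegregation result can be given a rough statistical reading. The following Python sketch rests on assumptions that are not stated in this specification: the lethal mutation is a simple recessive, and under the alternative hypothesis it is completely unlinked to a single T-DNA insert. Under that alternative, a viable plant from a selfed heterozygote carries the lethal allele with probability 2/3, so the chance that every one of n independently chosen resistant plants segregates for lethality is (2/3) raised to the power n. The function name is hypothetical.

```python
from fractions import Fraction


def chance_all_segregate_if_unlinked(n_plants):
    """Probability that all n resistant plants carry the lethal allele by chance alone.

    Assumes a recessive lethal unlinked to the T-DNA marker, so each viable
    plant is heterozygous for the lethal allele with probability 2/3.
    """
    return Fraction(2, 3) ** n_plants


if __name__ == "__main__":
    # Numbers of resistant seedlings examined per line, as given in the text above.
    for line, n_plants in (("1917", 17), ("2092", 35), ("7724", 37)):
        probability = float(chance_all_segregate_if_unlinked(n_plants))
        print(f"line {line}: n = {n_plants}, chance under no linkage = {probability:.2e}")
```

Under these assumptions the chance is on the order of one in a thousand for 17 plants and far smaller for 35 or 37 plants, which illustrates why complete cosegregation is taken as evidence that the insertion is at or very near the essential gene.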

[0095] Using the results of the Southern analysis, appropriate restriction enzymes are chosen to perform plasmid rescue in order to molecularly clone Arabidopsis thaliana genomic DNA flanking one or both sides of the T-DNA insertion. Plasmids obtained in this manner are analyzed by restriction enzyme digestion to sort the plasmids into classes based on their digestion pattern. For each class of plasmid clone, the DNA sequence is determined.

[0096] The resulting sequences, obtained by any of the above outlined approaches, are analyzed for the presence of non-T-DNA vector sequences. When such sequences are found, they are used to search DNA and protein databases using the BLAST and BLAST2 programs (Altschul et al. (1990) J. Mol. Biol. 215: 403-410; Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402). Additional genomic and cDNA sequences for each gene are identified by standard molecular biology procedures.

[0097] One method of confirming that the disrupted gene is the cause of the mutant phenotype is to transform a wild-type form of the gene into the mutant plant. Another method is identification of a second mutant allele showing a lethal phenotype. Alternatively, the mutant is phenocopied by specifically reducing expression of the disrupted gene in transgenic plants expressing an antisense version of the gene behind a synthetic promoter (Guyer et al. (1998) Genetics, 149: 633-639).

[0098] II. Sequence of the Arabidopsis 1917, 2092, and 7724 Genes

[0099] The Arabidopsis 1917 gene is identified by isolating DNA flanking the T-DNA border from the tagged embryo-lethal line # 1917. Arabidopsis DNA flanking the T-DNA border is identical to regions of two sequenced EST clones from Arabidopsis (Genbank accession numbers H77096 and R30603). The inventors are the first to demonstrate that the 1917 gene product is essential for normal growth and development in plants, as well as to define the function of the 1917 gene product through protein homology. The present invention discloses the cDNA nucleotide sequence of the Arabidopsis 1917 gene as well as the amino acid sequence of the Arabidopsis 1917 protein. The nucleotide sequence corresponding to the cDNA coding region is set forth in SEQ ID NO:1, and the amino acid sequence of the encoded protein is set forth in SEQ ID NO:2. The nucleotide sequence corresponding to the complete cDNA, which includes 5′ UTR and coding and 3′ UTR sequences, is set forth in SEQ ID NO:24. The present invention also encompasses an isolated amino acid sequence derived from a plant, wherein said amino acid sequence is identical or substantially similar to the amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO: 1, wherein said amino acid sequence has 1917 activity. Using GAP programs with the default settings, the sequence of the 1917 gene shows similarity to arginyl tRNA synthetase. Notable species similarities include: Chinese hamster (Genbank peptide accession # P37880); human (Genbank peptide accession # NP_002878.1); Synechocystis (Genbank peptide accession # Q55486); C. elegans (Genbank peptide accession # Q19825); Chlamydia sp. (Genbank peptide accession # AE001641); Streptomyces sp. (Genbank peptide accession # AL079345); Haemophilus (Genbank peptide accession # P43832); E. coli (Genbank peptide accession # P11875); S. cerevisiae (Genbank peptide accession # NP_010628.1); and S. pombe (Genbank peptide accession # AL031853).

[0100] The Arabidopsis 2092 gene is identified by isolating DNA flanking the T-DNA border from the tagged embryo-lethal line # 2092. Arabidopsis DNA flanking the T-DNA border is identical to a sequenced P1 clone MRN17 (GenBank accession # AB005243). The inventors are the first to demonstrate that the 2092 gene product is essential for normal growth and development in plants, as well as to define the function of the 2092 gene product through protein homology. The present invention discloses the cDNA nucleotide sequence of the Arabidopsis 2092 gene as well as the amino acid sequence of the Arabidopsis 2092 protein. The nucleotide sequence corresponding to the cDNA coding region is set forth in SEQ ID NO:3, and the amino acid sequence of the encoded protein is set forth in SEQ ID NO:4. The present invention also encompasses an isolated amino acid sequence derived from a plant, wherein said amino acid sequence is identical or substantially similar to the amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO: 4, wherein said amino acid sequence has 2092 activity. Using GAP programs with the default settings, the sequence of the 2092 gene shows similarity to alanyl tRNA synthetase genes. Notable species similarities include: Synechocystis (Genbank peptide accession # G2500959); E. coli (Genbank peptide accession # AE000353); yeast (Genbank peptide accession # NP_014980); Drosophila (Genbank peptide accession # AF188718); and human (Genbank peptide accession # AB033096).

[0101] The Arabidopsis 7724 gene is identified by isolating DNA flanking the T-DNA border from the tagged embryo-lethal line # 7724. Arabidopsis DNA flanking the T-DNA border is identical to a portion of the sequence of the BAC clone F4L23 (Genbank accession # AC002387). Annotation suggests that a gene is present in the region disrupted by the T-DNA. BLAST-N searches of the annotated gene region, using default settings, reveal public EST clones with sequence identity to the predicted gene, indicating that this region contains an expressed gene. The EST clones are 10409T7 and 10409XP (different ends of the same clone). The inventors are the first to demonstrate that the 7724 gene product is essential for normal growth and development in plants, as well as to define the function of the 7724 gene product through protein homology. The present invention discloses the cDNA nucleotide sequence of the Arabidopsis 7724 gene as well as the amino acid sequence of the Arabidopsis 7724 protein. The nucleotide sequence corresponding to the cDNA coding region is set forth in SEQ ID NO:5, and the amino acid sequence of the encoded protein is set forth in SEQ ID NO:6. The present invention also encompasses an isolated amino acid sequence derived from a plant, wherein said amino acid sequence is identical or substantially similar to the amino acid sequence encoded by the nucleotide sequence set forth in SEQ ID NO: 5, wherein said amino acid sequence has 7724 activity. Using GAP programs with the default settings, the sequence of the 7724 gene shows similarity to 2′ tRNA phosphotransferase genes. Notable species similarities include: S. cerevisiae (Genbank peptide accession # NP_014539); Streptomyces coelicolor (Genbank peptide accession # CAA22225); S. pombe (Genbank peptide accession # CAB16372); Pyrococcus horikoshii (Genbank peptide accession # BAA29229); and Archaeoglobus fulgidus (Genbank peptide accession # AAB90829).

[0102] III. Recombinant Production of 1917, 2092, and 7724 Activities and Uses Thereof

[0103] For recombinant production of 1917, 2092, or 7724 activity in a host organism, a nucleotide sequence encoding a protein having one of the above activities is inserted into an expression cassette designed for the chosen host and introduced into the host where it is recombinantly produced. For example, SEQ ID NO:1, or nucleotide sequences substantially similar to SEQ ID NO:1, or homologs of the 1917 coding sequence can be used for the recombinant production of a protein having 1917 activity. For example, SEQ ID NO:3, or nucleotide sequences substantially similar to SEQ ID NO:3, or homologs of the 2092 coding sequence can be used for the recombinant production of a protein having 2092 activity. For example, SEQ ID NO:5, or nucleotide sequences substantially similar to SEQ ID NO:5, or homologs of the 7724 coding sequence can be used for the recombinant production of a protein having 7724 activity. The choice of specific regulatory sequences such as promoter, signal sequence, 5′ and 3′ untranslated sequences, and enhancer appropriate for the chosen host is within the level of skill of the routineer in the art. The resultant molecule, containing the individual elements operably linked in proper reading frame, may be inserted into a vector capable of being transformed into the host cell. Suitable expression vectors and methods for recombinant production of proteins are well known for host organisms such as E. coli, yeast, and insect cells (see, e.g., Luckow and Summers, Bio/Technol. 6: 47 (1988), and baculovirus expression vectors, e.g., those derived from the genome of Autographa californica nuclear polyhedrosis virus (AcMNPV)). A preferred baculovirus/insect system is pAcHLT (Pharmingen, San Diego, Calif.) used to transfect Spodoptera frugiperda Sf9 cells (ATCC) in the presence of linear Autographa californica baculovirus DNA (Pharmingen, San Diego, Calif.). The resulting virus is used to infect High Five Trichoplusia ni cells (Invitrogen, La Jolla, Calif.).

[0104] In a preferred embodiment, the nucleotide sequence encoding a protein having 1917, 2092, or 7724 activity is derived from a eukaryote, such as a mammal, a fly or a yeast, but is preferably derived from a plant. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively, or encodes a protein having 1917, 2092, or 7724 activity, respectively, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6, respectively. The nucleotide sequence set forth in SEQ ID NO:1 encodes the Arabidopsis 1917 protein, whose amino acid sequence is set forth in SEQ ID NO:2. The nucleotide sequence set forth in SEQ ID NO:3 encodes the Arabidopsis 2092 protein, whose amino acid sequence is set forth in SEQ ID NO:4. The nucleotide sequence set forth in SEQ ID NO:5 encodes the Arabidopsis 7724 protein, whose amino acid sequence is set forth in SEQ ID NO:6. In another preferred embodiment, the nucleotide sequences are derived from a prokaryote, preferably a bacterium, e.g. E. coli. Recombinantly produced protein having 1917, 2092, or 7724 activity is isolated and purified using a variety of standard techniques. The actual techniques that may be used will vary depending upon the host organism used, whether the protein is designed for secretion, and other such factors familiar to the skilled artisan (see, e.g. chapter 16 of Ausubel, F. et al., “Current Protocols in Molecular Biology”, pub. by John Wiley & Sons, Inc. (1994)).

[0105] Assays Utilizing the 1917, 2092, or 7724 Protein

[0106] Recombinantly produced 1917, 2092, or 7724 proteins having 1917, 2092, or 7724 activities, respectively, are useful for a variety of purposes. For example, they can be used in in vitro assays to screen known herbicidal chemicals whose target has not been identified to determine if they inhibit 1917, 2092, or 7724. Such in vitro assays may also be used as more general screens to identify chemicals that inhibit such enzymatic activity and that are therefore novel herbicide candidates. Alternatively, recombinantly produced 1917, 2092, or 7724 proteins having 1917, 2092, or 7724 activity may be used to elucidate the complex structure of these molecules and to further characterize their association with known inhibitors in order to rationally design new inhibitory herbicides as well as herbicide tolerant forms of the enzymes.

[0107] In vitro Inhibitor Assays: Discovery of Small Molecule Ligand that Interacts with the Gene Product of SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5

[0108] Once a protein has been identified as a potential herbicide target, the next step is to develop an assay that allows screening of large numbers of chemicals to determine which ones interact with the protein. Although it is straightforward to develop assays for proteins of known function, developing assays with proteins of unknown function is more difficult.

[0109] This difficulty can be overcome by using technologies that can detect interactions between a protein and a compound without knowing the biological function of the protein. A short description of three methods is presented, including fluorescence correlation spectroscopy, surface-enhanced laser desorption/ionization, and Biacore technologies.

[0110] Fluorescence Correlation Spectroscopy (FCS) theory was developed in 1972 but it is only in recent years that the technology to perform FCS became available (Magde et al. (1972) Phys. Rev. Lett., 29: 705-708; Maiti et al. (1997) Proc. Natl. Acad. Sci. USA, 94: 11753-1175). FCS measures the average diffusion rate of a fluorescent molecule within a small sample volume. The sample size can be as low as 10³ fluorescent molecules and the sample volume as low as the cytoplasm of a single bacterium. The diffusion rate is a function of the mass of the molecule and decreases as the mass increases. FCS can therefore be applied to protein-ligand interaction analysis by measuring the change in mass and therefore in diffusion rate of a molecule upon binding. In a typical experiment, the target to be analyzed is expressed as a recombinant protein with a sequence tag, such as a poly-histidine sequence, inserted at the N- or C-terminus. The expression takes place in E. coli, yeast or insect cells. The protein is purified by chromatography. For example, the poly-histidine tag can be used to bind the expressed protein to a metal chelate column such as Ni2+ chelated on iminodiacetic acid agarose. The protein is then labeled with a fluorescent tag such as carboxytetramethylrhodamine or BODIPY® (Molecular Probes, Eugene, Oreg.). The protein is then exposed in solution to the potential ligand, and its diffusion rate is determined by FCS using instrumentation available from Carl Zeiss, Inc. (Thornwood, N.Y.). Ligand binding is determined by changes in the diffusion rate of the protein.
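The dependence of diffusion rate on mass described above can be illustrated with a simple estimate. The following Python sketch assumes, beyond what is stated in this specification, that the diffusing species behaves as a compact sphere of constant density, so that by the Stokes-Einstein relation the diffusion coefficient scales with the inverse cube root of the molecular mass. The function name and the masses used are hypothetical.

```python
def diffusion_ratio(mass_free_kda, mass_bound_kda):
    """Ratio D_bound / D_free for compact spherical particles of equal density.

    Stokes-Einstein gives D proportional to 1/radius, and for a sphere of
    constant density the radius is proportional to mass ** (1/3).
    """
    if mass_free_kda <= 0 or mass_bound_kda <= 0:
        raise ValueError("masses must be positive")
    return (mass_free_kda / mass_bound_kda) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Hypothetical example: a labeled 50 kDa species whose mass doubles on binding.
    print(round(diffusion_ratio(50.0, 100.0), 3))
    # An eightfold increase in mass is needed to halve the diffusion coefficient.
    print(round(diffusion_ratio(50.0, 400.0), 3))
```

Under this assumption the change in diffusion rate grows only as the cube root of the change in mass, so the size of the expected shift depends strongly on the relative masses of the labeled species and its binding partner.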

[0111] Surface-Enhanced Laser Desorption/Ionization (SELDI) was invented by Hutchens and Yip during the late 1980s (Hutchens and Yip (1993) Rapid Commun. Mass Spectrom. 7: 576-580). When coupled to a time-of-flight mass spectrometer (TOF), SELDI provides a means to rapidly analyze molecules retained on a chip. It can be applied to ligand-protein interaction analysis by covalently binding the target protein on the chip and analyzing by MS the small molecules that bind to this protein (Worrall et al. (1998) Anal. Chem. 70: 750-756). In a typical experiment, the target to be analyzed is expressed as described for FCS. The purified protein is then used in the assay without further preparation. It is bound to the SELDI chip either by utilizing the poly-histidine tag or by other interaction such as ion exchange or hydrophobic interaction. The chip thus prepared is then exposed to the potential ligand via, for example, a delivery system capable of pipetting the ligands in a sequential manner (autosampler). The chip is then submitted to washes of increasing stringency, for example a series of washes with buffer solutions of increasing ionic strength. After each wash, the bound material is analyzed by submitting the chip to SELDI-TOF. Ligands that specifically bind the target will be identified by the stringency of the wash needed to elute them.

[0112] Biacore relies on changes in the refractive index at the surface layer upon binding of a ligand to a protein immobilized on the layer. In this system, a collection of small ligands is injected sequentially in a 2-5 microliter cell with the immobilized protein. Binding is detected by surface plasmon resonance (SPR) by recording laser light refracting from the surface. In general, the refractive index change for a given change of mass concentration at the surface layer is practically the same for all proteins and peptides, allowing a single method to be applicable for any protein (Liedberg et al. (1983) Sensors Actuators 4: 299-304; Malmqvist (1993) Nature, 361: 186-187). In a typical experiment, the target to be analyzed is expressed as described for FCS. The purified protein is then used in the assay without further preparation. It is bound to the Biacore chip either by utilizing the poly-histidine tag or by other interaction such as ion exchange or hydrophobic interaction. The chip thus prepared is then exposed to the potential ligand via the delivery system incorporated in the instruments sold by Biacore (Uppsala, Sweden) to pipet the ligands in a sequential manner (autosampler). The SPR signal on the chip is recorded and changes in the refractive index indicate an interaction between the immobilized target and the ligand. Analysis of the signal kinetics (on rate and off rate) allows discrimination between non-specific and specific interactions.
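The on rate and off rate mentioned above are commonly interpreted with a simple 1:1 binding model, which is used here only as an illustrative assumption and is not prescribed by this specification. In that model the equilibrium dissociation constant is the off rate divided by the on rate, the signal during injection rises exponentially toward an equilibrium level, and the signal after injection decays exponentially at the off rate. All function names and parameter values in the following Python sketch are hypothetical.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D (M) from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """SPR response at time t (s) during injection of ligand at concentration conc (M)."""
    k_d = dissociation_constant(k_on, k_off)
    r_eq = r_max * conc / (conc + k_d)   # plateau reached at long injection times
    k_obs = k_on * conc + k_off          # observed rate of approach to the plateau
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_off):
    """SPR response at time t (s) after the injection ends, starting from r_start."""
    return r_start * math.exp(-k_off * t)


if __name__ == "__main__":
    # Hypothetical rate constants for a specific binder.
    k_on, k_off = 1.0e5, 1.0e-3
    print(dissociation_constant(k_on, k_off))
    print(association_response(60.0, 1.0e-7, k_on, k_off, r_max=100.0))
```

In this picture a specific interaction would typically show a concentration-dependent approach to a saturable plateau and a measurable off rate, whereas a response that does not saturate or that returns to baseline immediately would be treated with suspicion.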

[0113] IV. In vivo Inhibitor Assay

[0114] In one embodiment, a suspected herbicide, for example identified by in vitro screening, is applied to plants at various concentrations. The suspected herbicide is preferably sprayed on the plants. After application of the suspected herbicide, its effect on the plants, for example death or suppression of growth, is recorded.

[0115] In another embodiment, an in vivo screening assay for inhibitors of the 1917, 2092, or 7724 activity uses transgenic plants, plant tissue, plant seeds or plant cells capable of overexpressing a nucleotide sequence having 1917, 2092, or 7724 activity, wherein the 1917, 2092, or 7724 gene product is enzymatically active in the transgenic plants, plant tissue, plant seeds or plant cells. The nucleotide sequence is preferably derived from a eukaryote, such as a yeast, but is preferably derived from a plant. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:1, or encodes an enzyme having 1917 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:2. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:3, or encodes an enzyme having 2092 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:4. In a further preferred embodiment, the nucleotide sequence is identical or substantially similar to the nucleotide sequence set forth in SEQ ID NO:5, or encodes an enzyme having 7724 activity, whose amino acid sequence is identical or substantially similar to the amino acid sequence set forth in SEQ ID NO:6. In another preferred embodiment, the nucleotide sequence is derived from a prokaryote, preferably a bacterium, e.g. E. coli.

[0116] A chemical is then applied to the transgenic plants, plant tissue, plant seeds or plant cells and to the isogenic non-transgenic plants, plant tissue, plant seeds or plant cells, and the growth or viability of the transgenic and non-transgenic plants, plant tissue, plant seeds or plant cells is determined after application of the chemical and compared. Compounds capable of inhibiting the growth of the non-transgenic plants, but not affecting the growth of the transgenic plants, are selected as specific inhibitors of 1917, 2092, or 7724 activity.
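The selection rule just described can be expressed as a simple filter over paired growth measurements. The following Python sketch is illustrative only: the function name, the compound identifiers, the representation of growth as a fraction of the untreated control, and both cutoff values are assumptions made for this example and are not specified in this description.

```python
def select_specific_inhibitors(growth, inhibited_below=0.5, unaffected_above=0.9):
    """Return compounds that inhibit non-transgenic but not transgenic plants.

    growth maps a compound name to a pair (non_transgenic, transgenic), each
    being growth relative to the untreated control (1.0 means no effect).
    The two cutoffs are hypothetical and would be set empirically.
    """
    selected = []
    for compound, (non_transgenic, transgenic) in growth.items():
        if non_transgenic < inhibited_below and transgenic >= unaffected_above:
            selected.append(compound)
    return sorted(selected)


if __name__ == "__main__":
    # Hypothetical measurements for three compounds.
    measurements = {
        "compound_A": (0.20, 0.95),  # inhibits only the non-transgenic plants
        "compound_B": (0.15, 0.10),  # inhibits both, so not target-specific
        "compound_C": (0.98, 1.00),  # no effect on either
    }
    print(select_specific_inhibitors(measurements))
```

A compound that suppresses both plant types, such as the second hypothetical entry, is excluded because overexpression of the target does not rescue growth, which suggests that it acts elsewhere.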

[0117] V. Herbicide Tolerant Plants

[0118] The present invention is further directed to plants, plant tissue, plant seeds, and plant cells tolerant to herbicides that inhibit the naturally occurring 1917, 2092, or 7724 activity in these plants, wherein the tolerance is conferred by an altered 1917, 2092, or 7724 activity. Altered 1917, 2092, or 7724 activity may be conferred upon a plant according to the invention by increasing expression of a wild-type herbicide-sensitive 1917, 2092, or 7724 gene, for example by providing additional wild-type 1917, 2092, or 7724 genes and/or by overexpressing the endogenous 1917, 2092, or 7724 gene, for example by driving expression with a strong promoter. Altered 1917, 2092, or 7724 activity also may be accomplished by expressing nucleotide sequences that are substantially similar to SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively, or homologs in a plant. Still further, altered 1917, 2092, or 7724 activity is conferred on a plant by expressing modified herbicide-tolerant 1917, 2092, or 7724 genes in the plant. Combinations of these techniques may also be used. Representative plants include any plants to which these herbicides are applied for their normally intended purpose. Preferred are agronomically important crops such as cotton, soybean, oilseed rape, sugar beet, maize, rice, wheat, barley, oats, rye, sorghum, millet, turf, forage, turf grasses, and the like.

[0119] A. Increased Expression of Wild-Type 1917, 2092, or 7724

[0120] Achieving altered 1917, 2092, or 7724 activity through increased expression results in a level of 1917, 2092, or 7724 activity in the plant cell at least sufficient to overcome growth inhibition caused by the herbicide when applied in amounts sufficient to inhibit normal growth of control plants. The level of expressed enzyme generally is at least two times, preferably at least five times, and more preferably at least ten times the natively expressed amount. Increased expression may be due to multiple copies of a wild-type 1917, 2092, or 7724 gene; multiple occurrences of the coding sequence within the gene (i.e. gene amplification) or a mutation in the non-coding, regulatory sequence of the endogenous gene in the plant cell. Plants having such altered gene activity can be obtained by direct selection in plants by methods known in the art (see, e.g. U.S. Pat. Nos. 5,162,602, and 4,761,373, and references cited therein). These plants also may be obtained by genetic engineering techniques known in the art. Increased expression of a herbicide-sensitive 1917, 2092, or 7724 gene can also be accomplished by transforming a plant cell with a recombinant or chimeric DNA molecule comprising a promoter capable of driving expression of an associated structural gene in a plant cell operatively linked to a homologous or heterologous structural gene encoding the 1917, 2092, or 7724 protein or a homolog thereof. Preferably, the transformation is stable, thereby providing a heritable transgenic trait.

[0121] B. Expression of Modified Herbicide-Tolerant 1917, 2092, or 7724 Proteins

[0122] According to this embodiment, plants, plant tissue, plant seeds, or plant cells are stably transformed with a recombinant DNA molecule comprising a suitable promoter functional in plants operatively linked to a coding sequence encoding a herbicide tolerant form of the 1917, 2092, or 7724 protein. A herbicide tolerant form of the enzyme has at least one amino acid substitution, addition or deletion that confers tolerance to a herbicide that inhibits the unmodified, naturally occurring form of the enzyme. The transgenic plants, plant tissue, plant seeds, or plant cells thus created are then selected by conventional selection techniques, whereby herbicide tolerant lines are isolated, characterized, and developed. Below are described methods for obtaining genes that encode herbicide tolerant forms of 1917, 2092, or 7724 protein.

[0123] One general strategy involves direct or indirect mutagenesis procedures on microbes. For instance, a genetically manipulatable microbe such as E. coli or S. cerevisiae may be subjected to random mutagenesis in vivo with mutagens such as UV light or ethyl or methyl methane sulfonate. Mutagenesis procedures are described, for example, in Miller, Experiments in Molecular Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1972); Davis et al., Advanced Bacterial Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1980); Sherman et al., Methods in Yeast Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1983); and U.S. Pat. No. 4,975,374. The microbe selected for mutagenesis contains a normal, inhibitor-sensitive 1917, 2092, or 7724 gene and is dependent upon the activity conferred by this gene. The mutagenized cells are grown in the presence of the inhibitor at concentrations that inhibit the unmodified gene. Colonies of the mutagenized microbe that grow better than the unmutagenized microbe in the presence of the inhibitor (i.e. exhibit resistance to the inhibitor) are selected for further analysis. 1917, 2092, or 7724 genes conferring tolerance to the inhibitor are isolated from these colonies, either by cloning or by PCR amplification, and their sequences are elucidated. Sequences encoding altered gene products are then cloned back into the microbe to confirm their ability to confer inhibitor tolerance.

[0124] A method of obtaining mutant herbicide-tolerant alleles of a plant 1917, 2092, or 7724 gene involves direct selection in plants. For example, the effect of a mutagenized 1917, 2092, or 7724 gene on the growth inhibition of plants such as Arabidopsis, soybean, or maize is determined by plating seeds, sterilized by art-recognized methods, on plates of a simple minimal salts medium containing increasing concentrations of the inhibitor. Such concentrations are in the range of 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, 100, 300, 1000 and 3000 parts per million (ppm). The lowest dose at which significant growth inhibition can be reproducibly detected is used for subsequent experiments. Determination of the lowest dose is routine in the art.
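By way of illustration only, the selection of the lowest reproducibly inhibitory dose described above might be sketched as follows. The function name, the 50% growth threshold, the minimum replicate count, and the example measurements are all assumptions introduced for this sketch; the specification does not prescribe a particular criterion for "significant" inhibition.

```python
# Illustrative sketch only: the threshold, replicate count, and data are hypothetical.

def lowest_inhibitory_dose(growth_by_dose, control_growth,
                           threshold=0.5, min_replicates=3):
    """Return the lowest dose (ppm) at which every replicate shows growth at or
    below `threshold` times the untreated control, or None if no dose qualifies.

    growth_by_dose maps a dose in ppm to a list of replicate growth measurements
    (for example, seedling fresh weight or root length).
    """
    for dose in sorted(growth_by_dose):
        replicates = growth_by_dose[dose]
        if len(replicates) < min_replicates:
            continue  # too few replicates to call the result reproducible
        if all(g <= threshold * control_growth for g in replicates):
            return dose
    return None


# Hypothetical measurements at four of the plated concentrations.
example = {
    0.1: [10.0, 9.5, 10.2],
    1: [9.0, 4.0, 8.0],     # one inhibited replicate only: not reproducible
    10: [4.0, 3.5, 4.2],
    100: [1.0, 0.5, 0.8],
}
print(lowest_inhibitory_dose(example, control_growth=10.0))
```

Requiring every replicate to fall below the threshold is one simple way to capture "reproducibly detected"; a statistical test across replicates could be substituted.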

[0125] Mutagenesis of plant material is utilized to increase the frequency at which resistant alleles occur in the selected population. Mutagenized seed material is derived from a variety of sources, including chemical or physical mutagenesis of seeds, or chemical or physical mutagenesis of pollen (Neuffer, In Maize for Biological Research, Sheridan, ed., Univ. Press, Grand Forks, N.Dak., pp. 61-64 (1982)), which is then used to fertilize plants and the resulting M₁ mutant seeds collected. Typically for Arabidopsis, M₂ seeds (Lehle Seeds, Tucson, Ariz.), which are progeny seeds of plants grown from seeds mutagenized with chemicals, such as ethyl methane sulfonate, or with physical agents, such as gamma rays or fast neutrons, are plated at densities of up to 10,000 seeds/plate (10 cm diameter) on minimal salts medium containing an appropriate concentration of inhibitor to select for tolerance. Seedlings that continue to grow and remain green 7-21 days after plating are transplanted to soil and grown to maturity and seed set. Progeny of these seeds are tested for tolerance to the herbicide. If the tolerance trait is dominant, plants whose seed segregate 3:1 (resistant:sensitive) are presumed to have been heterozygous for the resistance at the M₂ generation. Plants that give rise to all resistant seed are presumed to have been homozygous for the resistance at the M₂ generation. Such mutagenesis on intact seeds and screening of their M₂ progeny seed can also be carried out on other species, for instance soybean (see, e.g. U.S. Pat. No. 5,084,082). Alternatively, mutant seeds to be screened for herbicide tolerance are obtained as a result of fertilization with pollen mutagenized by chemical or physical means.
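As a non-limiting illustration, the segregation reasoning above can be checked with a chi-square goodness-of-fit test against the 3:1 ratio expected for a dominant trait. The function names and the 5% significance level are assumptions made for this sketch and are not part of the described method; 3.841 is the standard chi-square critical value for one degree of freedom at that level.

```python
# Illustrative sketch only: function names and the 5% significance level are assumptions.

CHI2_CRITICAL_DF1_P05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed progeny counts against a 3:1 expectation."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no progeny scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def classify_m2_parent(resistant, sensitive):
    """Classify an M2 plant from the resistant and sensitive counts of its progeny."""
    if resistant > 0 and sensitive == 0:
        return "presumed homozygous"
    if chi_square_3_to_1(resistant, sensitive) < CHI2_CRITICAL_DF1_P05:
        return "presumed heterozygous"
    return "unclassified"


print(classify_m2_parent(78, 22))   # close to 3:1
print(classify_m2_parent(100, 0))   # all progeny resistant
print(classify_m2_parent(50, 50))   # departs from 3:1
```

With small progeny samples an all-resistant result can arise by chance from a heterozygous parent, so the number of progeny scored should be large enough to make that unlikely.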

[0126] Confirmation that the genetic basis of the herbicide tolerance is a 1917, 2092, or 7724 gene is ascertained as exemplified below. First, alleles of the 1917, 2092, or 7724 gene from plants exhibiting resistance to the inhibitor are isolated using PCR with primers based either upon the Arabidopsis cDNA coding sequences shown in SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively, or, more preferably, based upon the unaltered 1917, 2092, or 7724 gene sequence from the plant used to generate tolerant alleles. After sequencing the alleles to determine the presence of mutations in the coding sequence, the alleles are tested for their ability to confer tolerance to the inhibitor on plants into which the putative tolerance-conferring alleles have been transformed. These plants can be either Arabidopsis plants or any other plant whose growth is susceptible to the 1917, 2092, or 7724 inhibitors. Second, the inserted 1917, 2092, or 7724 genes are mapped relative to known restriction fragment length polymorphisms (RFLPs) (see, for example, Chang et al., Proc. Natl. Acad. Sci. USA 85: 6856-6860 (1988); Nam et al., Plant Cell 1: 699-705 (1989)), cleaved amplified polymorphic sequences (CAPS) (Konieczny and Ausubel (1993) The Plant Journal, 4(2): 403-410), or SSLPs (Bell and Ecker (1994) Genomics, 19: 137-144). The 1917, 2092, or 7724 inhibitor tolerance trait is independently mapped using the same markers. When tolerance is due to a mutation in that 1917, 2092, or 7724 gene, the tolerance trait maps to a position indistinguishable from the position of the 1917, 2092, or 7724 gene.

[0127] Another method of obtaining herbicide-tolerant alleles of a 1917, 2092, or 7724 gene is by selection in plant cell cultures. Explants of plant tissue, e.g. embryos, leaf disks, etc. or actively growing callus or suspension cultures of a plant of interest are grown on medium in the presence of increasing concentrations of the inhibitory herbicide or an analogous inhibitor suitable for use in a laboratory environment. Varying degrees of growth are recorded in different cultures. In certain cultures, fast-growing variant colonies arise that continue to grow even in the presence of normally inhibitory concentrations of inhibitor. The frequency with which such faster-growing variants occur can be increased by treatment with a chemical or physical mutagen before exposing the tissues or cells to the inhibitor. Putative tolerance-conferring alleles of the 1917, 2092, or 7724 gene are isolated and tested as described in the foregoing paragraphs. Those alleles identified as conferring herbicide tolerance may then be engineered for optimal expression and transformed into the plant. Alternatively, plants can be regenerated from the tissue or cell cultures containing these alleles.

[0128] Still another method involves mutagenesis of wild-type, herbicide sensitive plant 1917, 2092, or 7724 genes in bacteria or yeast, followed by culturing the microbe on medium that contains inhibitory concentrations (i.e. sufficient to cause abnormal growth, inhibit growth or cause cell death) of the inhibitor, and then selecting those colonies that grow normally in the presence of the inhibitor. More specifically, a plant cDNA, such as the Arabidopsis cDNA encoding the 1917, 2092, or 7724 protein, is cloned into a microbe that otherwise lacks the 1917, 2092, or 7724 activity. The transformed microbe is then subjected to in vivo mutagenesis or to in vitro mutagenesis by any of several chemical or enzymatic methods known in the art, e.g. sodium bisulfite (Shortle et al., Methods Enzymol. 100:457-468 (1983)); methoxylamine (Kadonaga et al., Nucleic Acids Res. 13:1733-1745 (1985)); oligonucleotide-directed saturation mutagenesis (Hutchinson et al., Proc. Natl. Acad. Sci. USA, 83:710-714 (1986)); or various polymerase misincorporation strategies (see, e.g. Shortle et al., Proc. Natl. Acad. Sci. USA, 79:1588-1592 (1982); Shiraishi et al., Gene 64:313-319 (1988); and Leung et al., Technique 1:11-15 (1989)). Colonies that grow normally in the presence of normally inhibitory concentrations of inhibitor are picked and purified by repeated restreaking. Their plasmids are purified and tested for the ability to confer tolerance to the inhibitor by retransforming them into the microbe lacking 1917, 2092, or 7724 activity. The DNA sequences of cDNA inserts from plasmids that pass this test are then determined.

[0129] Herbicide resistant 1917, 2092, or 7724 proteins are also obtained using methods involving in vitro recombination, also called DNA shuffling. By DNA shuffling, mutations, preferably random mutations, are introduced into nucleotide sequences encoding 1917, 2092, or 7724 activity. DNA shuffling also leads to the recombination and rearrangement of sequences within a 1917, 2092, or 7724 gene or to recombination and exchange of sequences between two or more different 1917, 2092, or 7724 genes. These methods allow for the production of millions of mutated 1917, 2092, or 7724 coding sequences. The mutated genes, or shuffled genes, are screened for desirable properties, e.g. improved tolerance to herbicides, and for mutations that provide broad-spectrum tolerance to the different classes of inhibitor chemistry. Such screens are well within the skills of a routineer in the art.

[0130] In a preferred embodiment, a mutagenized 1917, 2092, or 7724 gene is formed from at least one template 1917, 2092, or 7724 gene, wherein the template 1917, 2092, or 7724 gene has been cleaved into double-stranded random fragments of a desired size, and comprising the steps of adding to the resultant population of double-stranded random fragments one or more single or double-stranded oligonucleotides, wherein said oligonucleotides comprise an area of identity and an area of heterology to the double-stranded random fragments; denaturing the resultant mixture of double-stranded random fragments and oligonucleotides into single-stranded fragments; incubating the resultant population of single-stranded fragments with a polymerase under conditions which result in the annealing of said single-stranded fragments at said areas of identity to form pairs of annealed fragments, said areas of identity being sufficient for one member of a pair to prime replication of the other, thereby forming a mutagenized double-stranded polynucleotide; and repeating the second and third steps for at least two further cycles, wherein the resultant mixture in the second step of a further cycle includes the mutagenized double-stranded polynucleotide from the third step of the previous cycle, and the further cycle forms a further mutagenized double-stranded polynucleotide, wherein the mutagenized polynucleotide is a mutated 1917, 2092, or 7724 gene having enhanced tolerance to a herbicide which inhibits naturally occurring 1917, 2092, or 7724 activity. In a preferred embodiment, the concentration of a single species of double-stranded random fragment in the population of double-stranded random fragments is less than 1% by weight of the total DNA. In a further preferred embodiment, the template double-stranded polynucleotide comprises at least about 100 species of polynucleotides. In another preferred embodiment, the size of the double-stranded random fragments is from about 5 bp to 5 kb. In a further preferred embodiment, the fourth step of the method comprises repeating the second and the third steps for at least 10 cycles. Such a method is described e.g. in Stemmer et al. (1994) Nature 370: 389-391, in U.S. Pat. Nos. 5,605,793 and 5,811,238, and in Crameri et al. (1998) Nature 391: 288-291, as well as in WO 97/20078, and these references are incorporated herein by reference.

[0131] In another preferred embodiment, any combination of two or more different 1917, 2092, or 7724 genes is mutagenized in vitro by a staggered extension process (StEP), as described e.g. in Zhao et al. (1998) Nature Biotechnology 16: 258-261. The two or more 1917, 2092, or 7724 genes are used as template for PCR amplification with the extension cycles of the PCR reaction preferably carried out at a lower temperature than the optimal polymerization temperature of the polymerase. For example, when a thermostable polymerase with an optimal temperature of approximately 72° C. is used, the temperature for the extension reaction is desirably below 72° C., more desirably below 65° C., preferably below 60° C., more preferably the temperature for the extension reaction is 55° C. Additionally, the duration of the extension reaction of the PCR cycles is desirably shorter than usually carried out in the art, more desirably it is less than 30 seconds, preferably it is less than 15 seconds, more preferably the duration of the extension reaction is 5 seconds. Only a short DNA fragment is polymerized in each extension reaction, allowing template switch of the extension products between the starting DNA molecules after each cycle of denaturation and annealing, thereby generating diversity among the extension products. The optimal number of cycles in the PCR reaction depends on the length of the 1917, 2092, or 7724 genes to be mutagenized, but desirably over 40 cycles, more desirably over 60 cycles, preferably over 80 cycles are used. Optimal extension conditions and the optimal number of PCR cycles for every combination of 1917, 2092, or 7724 genes are determined using procedures well known in the art. The other parameters for the PCR reaction are essentially the same as commonly used in the art. The primers for the amplification reaction are preferably designed to anneal to DNA sequences located outside of the 1917, 2092, or 7724 genes, e.g. to DNA sequences of a vector comprising the 1917, 2092, or 7724 genes, whereby the different 1917, 2092, or 7724 genes used in the PCR reaction are preferably comprised in separate vectors. The primers desirably anneal to sequences located less than 500 bp away from 1917, 2092, or 7724 sequences, preferably less than 200 bp away from the 1917, 2092, or 7724 sequences, more preferably less than 120 bp away from the 1917, 2092, or 7724 sequences. Preferably, the 1917, 2092, or 7724 sequences are surrounded by restriction sites, which are included in the DNA sequence amplified during the PCR reaction, thereby facilitating the cloning of the amplified products into a suitable vector.

[0132] In another preferred embodiment, fragments of 1917, 2092, or 7724 genes having cohesive ends are produced as described in WO 98/05765. The cohesive ends are produced by ligating a first oligonucleotide corresponding to a part of a 1917, 2092, or 7724 gene to a second oligonucleotide not present in the gene or corresponding to a part of the gene not adjoining to the part of the gene corresponding to the first oligonucleotide, wherein the second oligonucleotide contains at least one ribonucleotide. A double-stranded DNA is produced using the first oligonucleotide as template and the second oligonucleotide as primer. The ribonucleotide is cleaved and removed. The nucleotide(s) located 5′ to the ribonucleotide is also removed, resulting in double-stranded fragments having cohesive ends. Such fragments are randomly reassembled by ligation to obtain novel combinations of gene sequences.

[0133] In yet another embodiment, herbicide-resistant 1917, 2092, or 7724 proteins are produced using incremental truncation for the creation of hybrid enzymes (ITCHY), as described in Ostermeier et al. (1999) Nature Biotechnology 17:1205-1209, and this reference is incorporated herein by reference.

[0134] Any 1917, 2092, or 7724 gene or any combination of 1917, 2092, or 7724 genes is used for in vitro recombination in the context of the present invention, for example, a 1917, 2092, or 7724 gene derived from a plant, such as, e.g. Arabidopsis thaliana, e.g. a 1917, 2092, or 7724 gene set forth in SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, respectively. Other examples are a 1917-like gene from human (Girjes et al. (1995) Gene, 164: 347-350), a 2092-like gene from human (Shiba et al. (1995) Biochemistry, 33: 10340-10349), and a 7724-like gene from yeast (Culver et al. (1997) J. Biol. Chemistry, 272: 13203-13210), all of which are incorporated herein by reference. Whole 1917, 2092, or 7724 genes or portions thereof are used in the context of the present invention. The library of mutated 1917, 2092, or 7724 genes obtained by the methods described above is cloned into appropriate expression vectors and the resulting vectors are transformed into an appropriate host, for example an alga such as Chlamydomonas, a yeast, or a bacterium. An appropriate host is preferably a host that otherwise lacks 1917, 2092, or 7724 activity, for example E. coli. Host cells transformed with the vectors comprising the library of mutated 1917, 2092, or 7724 genes are cultured on medium that contains inhibitory concentrations of the inhibitor and those colonies that grow in the presence of the inhibitor are selected. Colonies that grow in the presence of normally inhibitory concentrations of inhibitor are picked and purified by repeated restreaking. Their plasmids are purified and the DNA sequences of cDNA inserts from plasmids that pass this test are then determined.

[0135] An assay for identifying a modified 1917, 2092, or 7724 gene that is tolerant to an inhibitor may be performed in the same manner as the assay to identify inhibitors of the 1917, 2092, or 7724 activity (Inhibitor Assay, above) with the following modifications: First, a mutant 1917, 2092, or 7724 protein is substituted in one of the reaction mixtures for the wild-type 1917, 2092, or 7724 protein of the inhibitor assay. Second, an inhibitor of wild-type enzyme is present in both reaction mixtures. Third, mutated activity (activity in the presence of inhibitor and mutated enzyme) and unmutated activity (activity in the presence of inhibitor and wild-type enzyme) are compared to determine whether a significant increase in enzymatic activity is observed in the mutated activity when compared to the unmutated activity. Mutated activity is any measure of activity of the mutated enzyme while in the presence of a suitable substrate and the inhibitor. Unmutated activity is any measure of activity of the wild-type enzyme while in the presence of a suitable substrate and the inhibitor.
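Purely as an illustrative sketch, the comparison of mutated and unmutated activity may be expressed as a fold change between replicate measurements taken in the presence of the inhibitor. The function names and the two-fold cutoff below are hypothetical; the specification leaves the measure of activity and the criterion for a "significant increase" to the practitioner.

```python
# Illustrative sketch only: names and the two-fold cutoff are hypothetical.
from statistics import mean


def tolerance_fold(mutated_activity, unmutated_activity):
    """Ratio of mean mutated activity to mean unmutated activity.

    Both arguments are lists of replicate activity readings taken with a
    suitable substrate and the inhibitor present.
    """
    baseline = mean(unmutated_activity)
    if baseline <= 0:
        raise ValueError("unmutated activity must be positive to compute a ratio")
    return mean(mutated_activity) / baseline


def is_candidate_tolerant(mutated_activity, unmutated_activity, min_fold=2.0):
    """Flag a mutant enzyme whose activity under inhibitor exceeds the cutoff."""
    return tolerance_fold(mutated_activity, unmutated_activity) >= min_fold


print(tolerance_fold([8.0, 10.0, 12.0], [2.0, 2.0, 2.0]))
print(is_candidate_tolerant([8.0, 10.0, 12.0], [2.0, 2.0, 2.0]))
```

A fold-change cutoff is the simplest possible criterion; a replicate-aware statistical test would be a natural refinement when assay variance is high.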

[0136] In addition to being used to create herbicide-tolerant plants, genes encoding herbicide tolerant 1917, 2092, or 7724 protein can also be used as selectable markers in plant cell transformation methods. For example, plants, plant tissue, plant seeds, or plant cells transformed with a heterologous DNA sequence can also be transformed with a sequence encoding an altered 1917, 2092, or 7724 activity capable of being expressed by the plant. The transformed cells are transferred to medium containing an inhibitor of the enzyme in an amount sufficient to inhibit the growth or survivability of plant cells not expressing the modified coding sequence, wherein only the transformed cells will grow. The method is applicable to any plant cell capable of being transformed with a modified 1917, 2092, or 7724 gene, and can be used with any heterologous DNA sequence of interest. Expression of the heterologous DNA sequence and the modified gene can be driven by the same promoter functional in plant cells, or by separate promoters.

[0137] VI. Plant Transformation Technology

[0138] A wild-type or herbicide-tolerant form of the 1917, 2092, or 7724 gene, or homologs thereof, can be incorporated in plant or bacterial cells using conventional recombinant DNA technology. Generally, this involves inserting a DNA molecule encoding the 1917, 2092, or 7724 gene into an expression system to which the DNA molecule is heterologous (i.e., not normally present) using standard cloning procedures known in the art. The vector contains the necessary elements for the transcription and translation of the inserted protein-coding sequences in a host cell containing the vector. A large number of vector systems known in the art can be used, such as plasmids, bacteriophage viruses and other modified viruses. The components of the expression system may also be modified to increase expression. For example, truncated sequences, nucleotide substitutions, nucleotide optimization or other modifications may be employed. Expression systems known in the art can be used to transform virtually any crop plant cell under suitable conditions. A heterologous DNA sequence comprising a wild-type or herbicide-tolerant form of the 1917, 2092, or 7724 gene is preferably stably transformed and integrated into the genome of the host cells. In another preferred embodiment, the heterologous DNA sequence comprising a wild-type or herbicide-tolerant form of the 1917, 2092, or 7724 gene is located on a self-replicating vector. Examples of self-replicating vectors are viruses, in particular geminiviruses. Transformed cells can be regenerated into whole plants such that the chosen form of the 1917, 2092, or 7724 gene confers herbicide tolerance in the transgenic plants.

[0139] A. Requirements for Construction of Plant Expression Cassettes

[0140] Gene sequences intended for expression in transgenic plants are first assembled in expression cassettes behind a suitable promoter expressible in plants. The expression cassettes may also comprise any further sequences required or selected for the expression of the heterologous DNA sequence. Such sequences include, but are not restricted to, transcription terminators, extraneous sequences to enhance expression such as introns, viral sequences, and sequences intended for the targeting of the gene product to specific organelles and cell compartments. These expression cassettes can then be easily transferred to the plant transformation vectors described infra. The following is a description of various components of typical expression cassettes.

[0141] 1. Promoters

[0142] The selection of the promoter used in expression cassettes will determine the spatial and temporal expression pattern of the heterologous DNA sequence in the plant transformed with this DNA sequence. Selected promoters will express heterologous DNA sequences in specific cell types (such as leaf epidermal cells, mesophyll cells, root cortex cells) or in specific tissues or organs (roots, leaves or flowers, for example) and the selection will reflect the desired location of accumulation of the gene product. Alternatively, the selected promoter may drive expression of the gene under various inducing conditions. Promoters vary in their strength, i.e., ability to promote transcription. Depending upon the host cell system utilized, any one of a number of suitable promoters known in the art can be used. For example, for constitutive expression, the CaMV 35S promoter, the rice actin promoter, or the ubiquitin promoter may be used. For regulatable expression, the chemically inducible PR-1 promoter from tobacco or Arabidopsis may be used (see, e.g., U.S. Pat. No. 5,689,044).

[0143] 2. Transcriptional Terminators

[0144] A variety of transcriptional terminators are available for use in expression cassettes. These are responsible for the termination of transcription beyond the heterologous DNA sequence and its correct polyadenylation. Appropriate transcriptional terminators are those that are known to function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator and the pea rbcS E9 terminator. These can be used in both monocotyledonous and dicotyledonous plants.

[0145] 3. Sequences for the Enhancement or Regulation of Expression

[0146] Numerous sequences have been found to enhance gene expression from within the transcriptional unit and these sequences can be used in conjunction with the genes of this invention to increase their expression in transgenic plants. For example, various intron sequences such as introns of the maize AdhI gene have been shown to enhance expression, particularly in monocotyledonous cells. In addition, a number of non-translated leader sequences derived from viruses are also known to enhance expression, and these are particularly effective in dicotyledonous cells.

[0147] 4. Coding Sequence Optimization

[0148] The coding sequence of the selected gene may be genetically engineered by altering the coding sequence for optimal expression in the crop species of interest. Methods for modifying coding sequences to achieve optimal expression in a particular crop species are well known (see, e.g. Perlak et al., Proc. Natl. Acad. Sci. USA 88: 3324 (1991); and Koziel et al., Bio/technol. 11: 194 (1993)).
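One common approach to such optimization, offered here only as a hedged sketch and not as the method of the cited references, is to replace each codon with a synonymous codon preferred by the target crop. The codon table below is deliberately partial, and the preferred-codon choices are hypothetical placeholders; in practice they would be taken from a codon usage table for the species of interest.

```python
# Illustrative sketch only: the table is partial and the "preferred" codons are hypothetical.

# Partial standard genetic code (DNA codons) covering the amino acids used in the example.
CODON_TO_AMINO_ACID = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Hypothetical preferred codon per amino acid for an assumed target crop.
PREFERRED_CODON = {"M": "ATG", "A": "GCC", "K": "AAG", "E": "GAG", "*": "TGA"}


def optimize_coding_sequence(cds):
    """Swap each recognized codon for the preferred synonymous codon.

    The encoded protein is unchanged because only synonymous codons are
    exchanged; codons absent from the partial table are left as they are.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    optimized = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3].upper()
        amino_acid = CODON_TO_AMINO_ACID.get(codon)
        optimized.append(PREFERRED_CODON[amino_acid] if amino_acid else codon)
    return "".join(optimized)


print(optimize_coding_sequence("ATGGCTAAAGAATAA"))
```

Real optimization schemes typically also consider GC content, cryptic splice sites, and mRNA instability motifs, none of which this sketch addresses.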

[0149] 5. Targeting of the Gene Product Within the Cell

[0150] Various mechanisms for targeting gene products are known to exist in plants and the sequences controlling the functioning of these mechanisms have been characterized in some detail. For example, the targeting of gene products to the chloroplast is controlled by a signal sequence found at the amino terminal end of various proteins which is cleaved during chloroplast import to yield the mature protein (e.g. Comai et al. J. Biol. Chem. 263: 15104-15109 (1988)). Other gene products are localized to other organelles such as the mitochondrion and the peroxisome (e.g. Unger et al. Plant Molec. Biol. 13: 411-418 (1989)). The cDNAs encoding these products can also be manipulated to effect the targeting of heterologous products encoded by DNA sequences to these organelles. In addition, sequences have been characterized which cause the targeting of products encoded by DNA sequences to other cell compartments. Amino terminal sequences are responsible for targeting to the ER, the apoplast, and extracellular secretion from aleurone cells (Koehler & Ho, Plant Cell 2: 769-783 (1990)). Additionally, amino terminal sequences in conjunction with carboxy terminal sequences are responsible for vacuolar targeting of gene products (Shinshi et al. Plant Molec. Biol. 14: 357-368 (1990)). By the fusion of the appropriate targeting sequences described above to heterologous DNA sequences of interest it is possible to direct this product to any organelle or cell compartment.

[0151] B. Construction of Plant Transformation Vectors

[0152] Numerous transformation vectors available for plant transformation are known to those of ordinary skill in the plant transformation arts, and the genes pertinent to this invention can be used in conjunction with any such vectors. The selection of vector will depend upon the preferred transformation technique and the target species for transformation. For certain target species, different antibiotic or herbicide selection markers may be preferred. Selection markers used routinely in transformation include the nptII gene, which confers resistance to kanamycin and related antibiotics (Messing & Vieira, Gene 19: 259-268 (1982); Bevan et al., Nature 304:184-187 (1983)), the bar gene, which confers resistance to the herbicide phosphinothricin (White et al., Nucl. Acids Res 18: 1062 (1990), Spencer et al. Theor. Appl. Genet 79: 625-631 (1990)), the hph gene, which confers resistance to the antibiotic hygromycin (Blochinger & Diggelmann, Mol Cell Biol 4: 2929-2931), the manA gene, which allows for positive selection in the presence of mannose (Miles and Guest (1984) Gene, 32:41-48; U.S. Pat. No. 5,767,378), the dhfr gene, which confers resistance to methotrexate (Bourouis et al., EMBO J. 2(7): 1099-1104 (1983)), and the EPSPS gene, which confers resistance to glyphosate (U.S. Pat. Nos. 4,940,935 and 5,188,642).

[0153] 1. Vectors Suitable for Agrobacterium Transformation

[0154] Many vectors are available for transformation using Agrobacterium tumefaciens. These typically carry at least one T-DNA border sequence and include vectors such as pBIN19 (Bevan, Nucl. Acids Res. (1984)). Typical vectors suitable for Agrobacterium transformation include the binary vectors pCIB200 and pCIB2001, as well as the binary vector pCIB10 and hygromycin selection derivatives thereof. (See, for example, U.S. Pat. No. 5,639,949).

[0155] 2. Vectors Suitable for non-Agrobacterium Transformation

[0156] Transformation without the use of Agrobacterium tumefaciens circumvents the requirement for T-DNA sequences in the chosen transformation vector and consequently vectors lacking these sequences can be utilized in addition to vectors such as the ones described above which contain T-DNA sequences. Transformation techniques that do not rely on Agrobacterium include transformation via particle bombardment, protoplast uptake (e.g. PEG and electroporation) and microinjection. The choice of vector depends largely on the preferred selection for the species being transformed. Typical vectors suitable for non-Agrobacterium transformation include pCIB3064, pSOG19, and pSOG35. (See, for example, U.S. Pat. No. 5,639,949).

[0157] C. Transformation Techniques

[0158] Once the coding sequence of interest has been cloned into an expression system, it is transformed into a plant cell. Methods for transformation and regeneration of plants are well known in the art. For example, Ti plasmid vectors have been utilized for the delivery of foreign DNA, as well as direct DNA uptake, liposomes, electroporation, micro-injection, and microprojectiles. In addition, bacteria from the genus Agrobacterium can be utilized to transform plant cells.

[0159] Transformation techniques for dicotyledons are well known in the art and include Agrobacterium-based techniques and techniques that do not require Agrobacterium. Non-Agrobacterium techniques involve the uptake of exogenous genetic material directly by protoplasts or cells. This can be accomplished by PEG- or electroporation-mediated uptake, particle bombardment-mediated delivery, or microinjection. In each case the transformed cells are regenerated to whole plants using standard techniques known in the art.

[0160] Transformation of most monocotyledon species has now also become routine. Preferred techniques include direct gene transfer into protoplasts using PEG or electroporation techniques, particle bombardment into callus tissue, as well as Agrobacterium-mediated transformation.

[0161] D. Plastid Transformation

[0162] In another preferred embodiment, a nucleotide sequence encoding a polypeptide having 1917, 2092, or 7724 activity is directly transformed into the plastid genome. Plastid expression, in which genes are inserted by homologous recombination into the several thousand copies of the circular plastid genome present in each plant cell, takes advantage of the enormous copy number relative to nuclear-expressed genes to permit expression levels that can readily exceed 10% of the total soluble plant protein. In a preferred embodiment, the nucleotide sequence is inserted into a plastid-targeting vector and transformed into the plastid genome of a desired plant host. Plants homoplasmic for plastid genomes containing the nucleotide sequence are obtained, and are preferentially capable of high expression of the nucleotide sequence.

[0163] Plastid transformation technology is, for example, extensively described in U.S. Pat. Nos. 5,451,513, 5,545,817, 5,545,818, and 5,877,462, in PCT application nos. WO 95/16783 and WO 97/32977, and in McBride et al. (1994) Proc. Natl. Acad. Sci. USA 91, 7301-7305, all incorporated herein by reference in their entirety. The basic technique for plastid transformation involves introducing regions of cloned plastid DNA flanking a selectable marker together with the nucleotide sequence into a suitable target tissue, e.g., using biolistics or protoplast transformation (e.g., calcium chloride or PEG mediated transformation). The 1 to 1.5 kb flanking regions, termed targeting sequences, facilitate homologous recombination with the plastid genome and thus allow the replacement or modification of specific regions of the plastome. Initially, point mutations in the chloroplast 16S rRNA and rps12 genes conferring resistance to spectinomycin and/or streptomycin were utilized as selectable markers for transformation (Svab, Z., Hajdukiewicz, P., and Maliga, P. (1990) Proc. Natl. Acad. Sci. USA 87, 8526-8530; Staub, J. M., and Maliga, P. (1992) Plant Cell 4, 39-45). The presence of cloning sites between these markers allowed creation of a plastid targeting vector for introduction of foreign genes (Staub, J. M., and Maliga, P. (1993) EMBO J. 12, 601-606). Substantial increases in transformation frequency are obtained by replacement of the recessive rRNA or r-protein antibiotic resistance genes with a dominant selectable marker, the bacterial aadA gene encoding the spectinomycin-detoxifying enzyme aminoglycoside-3′-adenyltransferase (Svab, Z., and Maliga, P. (1993) Proc. Natl. Acad. Sci. USA 90, 913-917). Other selectable markers useful for plastid transformation are known in the art and encompassed within the scope of the invention.

[0164] VII. Breeding

[0165] The wild-type or altered form of a 1917, 2092, or 7724 gene of the present invention can be utilized to confer herbicide tolerance to a wide variety of plant cells, including those of gymnosperms, monocots, and dicots. Although the gene can be inserted into any plant cell falling within these broad classes, it is particularly useful in crop plant cells, such as rice, wheat, barley, rye, corn, potato, carrot, sweet potato, sugar beet, bean, pea, chicory, lettuce, cabbage, cauliflower, broccoli, turnip, radish, spinach, asparagus, onion, garlic, eggplant, pepper, celery, squash, pumpkin, zucchini, cucumber, apple, pear, quince, melon, plum, cherry, peach, nectarine, apricot, strawberry, grape, raspberry, blackberry, pineapple, avocado, papaya, mango, banana, soybean, tobacco, tomato, sorghum and sugarcane.

[0166] The high-level expression of a wild-type 1917, 2092, or 7724 gene and/or the expression of herbicide-tolerant forms of a 1917, 2092, or 7724 gene conferring herbicide tolerance in plants, in combination with other characteristics important for production and quality, can be incorporated into plant lines through breeding approaches and techniques known in the art.

[0167] Where a herbicide tolerant 1917, 2092, or 7724 gene allele is obtained by direct selection in a crop plant or plant cell culture from which a crop plant can be regenerated, it is moved into commercial varieties using traditional breeding techniques to develop a herbicide tolerant crop without the need for genetically engineering the allele and transforming it into the plant.

[0168] The invention will be further described by reference to the following detailed examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified.

EXAMPLES

[0169] Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by J. Sambrook, et al., Molecular Cloning: A Laboratory Manual, 3d Ed., Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press (2001); by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984); by Ausubel, F. M. et al., Current Protocols in Molecular Biology, New York, John Wiley and Sons Inc., (1988); by Reiter, et al., Methods in Arabidopsis Research, World Scientific Press (1992); and by Schultz et al., Plant Molecular Biology Manual, Kluwer Academic Publishers (1998). These references describe the standard techniques used for all steps in tagging and cloning genes from T-DNA mutagenized populations of Arabidopsis: plant infection and transformation; screening for the identification of seedling mutants; cosegregation analysis; and plasmid rescue.

Example 1

[0170] Plant Infection and Transformation in Tagged Embryo-Lethal Lines 1917, 2092, and 7724

[0171] Arabidopsis plants (strain Columbia) are inverted, and their leaves are vacuum-infiltrated with Agrobacterium (1X dilution of Agrobacterium grown to OD600 of 0.8 in 10 mM MgCl₂). T1 seed is collected from these plants, and germinated on an agar-solidified medium containing 50 μg/ml Basta or sown in soil and sprayed with 400 μg/ml Basta. Typically, 0.1% to 1.0% of the plants in a T1 population contain T-DNA inserts. Furthermore, the plants that survive on Basta selection are hemizygous for the T-DNA insertion and thus the Basta selectable marker.

[0172] Mutants blocked in growth or development are identified by examining T2 progeny using an embryo screen and recovering those plants that contain 25% aborted seeds. Using segregation analysis of T2 individuals, approximately one-third of the mutants are tagged.

Example 2

[0173] Embryo Screen for the Identification of Mutants Blocked in Early Development from Tagged Embryo-Lethal Lines 1917, 2092, and 7724

[0174] Essential genes are identified through the isolation of lethal mutants blocked in early development. Examples of lethal mutants include those blocked in the formation of the male or female gametes, embryo, or resulting seedling. Gametophytic mutants are found by examining T1 insertion lines for the presence of 50% aborted pollen grains or ovules. Embryo defective lethal mutants produce 25% defective seeds following self-pollination of T1 plants (see Errampalli et al. 1991, Plant Cell 3:149-157; Castle et al. 1993, Mol Gen Genet 241:504-514). Seedling lethal mutants segregate for 25% seedlings that exhibit a lethal phenotype.

[0175] The T1 line #1917 shows 25% defective seeds that contain embryos that are arrested at the globular stage of development.

[0176] The T1 line #2092 shows 25% defective seeds that contain embryos that are arrested at the preglobular to globular stages of development.

[0177] The T1 line #7724 shows 25% defective seeds that contain embryos that are arrested at the torpedo to cotyledon stage of development.
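The 25% figures reported for these lines follow from simple Mendelian arithmetic. As a minimal sketch, assuming a single recessive embryo-lethal insertion and self-pollination of a heterozygous plant, the expected genotype classes can be enumerated as follows. The allele labels ("T" for the insertion, "+" for wild type) and the function name are illustrative and are not taken from the disclosure.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def selfed_progeny(parent=("T", "+")):
    """Return expected genotype fractions after selfing a diploid parent.

    Each gamete carries one of the two parental alleles with equal
    probability, so the four gamete pairings are equally likely.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent, repeat=2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


progeny = selfed_progeny()

# Homozygous insertion class: the seeds expected to abort.
defective_fraction = progeny["TT"]

# Among the seeds that survive, the share that still carries the insert.
viable = 1 - defective_fraction
resistant_among_viable = progeny["+T"] / viable

print("defective seeds:", defective_fraction)
print("insert carriers among viable progeny:", resistant_among_viable)
```

Under these assumptions one quarter of the seeds are homozygous for the insertion, which matches the 25% defective seeds described for an embryo-defective lethal.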

Example 3

[0178] Cosegregation Analysis for Tagged Embryo-Lethal Lines 1917, 2092, and 7724

[0179] The linkage of the mutation to the T-DNA insert is established after identifying a transformed line segregating for a lethal phenotype of interest. A line segregating with a single functional insert will segregate for resistance in the ratio of 2:1 (resistant:sensitive) to the selectable marker Basta. In this case, one-quarter of the T2 progeny will fail to germinate due to embryo lethality, resulting in a reduction of the normal 3:1 ratio to 2:1. Each of the Basta resistant progeny is therefore heterozygous for the mutation if the T-DNA insert is causing the mutant phenotype. To confirm cosegregation of the T-DNA and the mutant phenotype, Basta resistant progeny are transplanted to soil and screened again for the presence of 25% aborted seeds.
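Whether an observed segregation fits the 2:1 ratio rather than the usual 3:1 ratio can be checked with a chi-square goodness-of-fit test. The sketch below is illustrative only: the seedling counts are hypothetical and do not come from the disclosure, and 3.841 is the standard critical value for one degree of freedom at the 0.05 level.

```python
def chi_square(observed, ratio):
    """Chi-square statistic for observed class counts against an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value for 1 degree of freedom at p = 0.05.
CRITICAL_1DF_P05 = 3.841

# Hypothetical counts of (Basta resistant, Basta sensitive) T2 seedlings.
counts = (120, 60)

for ratio in [(2, 1), (3, 1)]:
    stat = chi_square(counts, ratio)
    verdict = "consistent" if stat < CRITICAL_1DF_P05 else "rejected"
    print(f"{ratio[0]}:{ratio[1]} ratio: chi-square = {stat:.3f} ({verdict})")
```

With these made-up counts the 2:1 hypothesis is retained and the 3:1 hypothesis is rejected, which is the pattern expected when the insert disrupts an embryo-essential gene.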

[0180] For 1917, each of the 18 progeny examined contains approximately 25% aborted seeds with the expected phenotype. These results provide no evidence for recombination between the T-DNA and the mutation. Single plant Southern blot analysis suggests that the T-DNA insertion in line #1917 is a simple insertion.

[0181] For 2092, each of the 35 progeny examined contains approximately 25% aborted seeds with the expected phenotype. These results provide no evidence for recombination between the T-DNA and the mutation. Single plant Southern blot analysis suggests that the insertion in line #2092 consists of at least three tandem T-DNA elements. Cosegregation analysis shows that hygromycin resistance and the mutant phenotype in line 2092 exhibit complete linkage in 35 selfed progeny from a selfed heterozygote.

[0182] For 7724, each of the 37 progeny examined contains approximately 25% aborted seeds with the expected phenotype. These results provide no evidence for recombination between the T-DNA and the mutation. Cosegregation analysis shows that Basta resistance and the mutant phenotype in line 7724 exhibit complete linkage in 37 selfed progeny from a selfed heterozygote.
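Observing no recombinants in a finite sample bounds, but does not exclude, a low recombinant frequency. As a rough sketch, assuming each progeny plant independently has the same probability p of appearing as a recombinant, the largest p still compatible with zero recombinants in n plants at a given confidence level solves (1 - p)^n = 1 - confidence. This is a simplified statistical illustration and not a calculation described in the disclosure; p here is a per-plant detection probability, not a genetic map distance.

```python
def recombinant_upper_bound(n_progeny, confidence=0.95):
    """Upper bound on per-plant recombinant frequency given zero observed recombinants."""
    if n_progeny <= 0:
        raise ValueError("n_progeny must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_progeny)


# Progeny numbers reported for the three lines.
progeny_examined = {"1917": 18, "2092": 35, "7724": 37}

for line, n in progeny_examined.items():
    bound = recombinant_upper_bound(n)
    print(f"line {line}: n = {n}, 95% upper bound = {bound:.3f}")
```

The bound tightens as more progeny are scored, which is why larger samples give stronger support for linkage between the insert and the mutation.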

Example 4a

[0183] Plasmid Rescue from Tagged Embryo-Lethal Line 1917

[0184] Arabidopsis genomic DNA is isolated as described in Reiter et al. in Methods in Arabidopsis Research, World Scientific Press (1992). Genomic DNA is digested with a restriction endonuclease and ligated overnight. After ligation, the DNA is transformed into competent E. coli strain XL-1 Blue, DH10B, DH5 alpha, or the like, and colonies are selected on semi-solid medium containing ampicillin. Resistant colonies are picked into liquid medium with ampicillin and grown overnight. Plasmid DNA is isolated and digested with the rescue enzyme and analyzed on agarose gels containing ethidium bromide for visualization. Plasmids that represent different size classes are sequenced using primers that flank the plant DNA portion of the rescue element and the sequence is analyzed to determine what portion is plant DNA and what gene has been disrupted.

[0185] One method of confirming that the disrupted gene is the cause of the mutant phenotype is to transform a wild-type form of the gene into the mutant plant. Alternatively, the mutant is phenocopied by specifically reducing expression of the disrupted gene in transgenic plants expressing an antisense version of the gene behind a synthetic promoter (Guyer et al. (1998) Genetics, 149: 633-639).

Example 4b

[0186] Plasmid Rescue from Tagged Embryo-Lethal Line 2092

[0187] Arabidopsis genomic DNA is isolated as described in Reiter et al. in Methods in Arabidopsis Research, World Scientific Press (1992). Genomic DNA is digested with a restriction endonuclease and ligated overnight. After ligation, the DNA is transformed into competent E. coli strain XL-1 Blue, DH10B, DH5 alpha, or the like, and colonies are selected on semi-solid medium containing ampicillin. Resistant colonies are picked into liquid medium with ampicillin and grown overnight. Plasmid DNA is isolated and digested with the rescue enzyme and analyzed on agarose gels containing ethidium bromide for visualization. Plasmids that represent different size classes are sequenced using primers that flank the plant DNA portion of the rescue element and the sequence is analyzed to determine what portion is plant DNA and what gene has been disrupted.

[0188] One method of confirming that the disrupted gene is the cause of the mutant phenotype is to transform a wild-type form of the gene into the mutant plant. Alternatively, the mutant is phenocopied by specifically reducing expression of the disrupted gene in transgenic plants expressing an antisense version of the gene behind a synthetic promoter (Guyer et al. (1998) Genetics, 149: 633-639).

Example 4c

[0189] Border Rescue from Tagged Embryo-Lethal Line 7724

[0190] Arabidopsis genomic DNA is isolated as described in Reiter et al. in Methods in Arabidopsis Research, World Scientific Press (1992). DNA flanking the borders of line #7724 is isolated using TAIL PCR. A series of 12 TAIL PCR reactions is performed on DNA from line #7724; 6 arbitrary degenerate primers (CA50 primer: 5′ NGT CGA SWG ANA WGA A 3′: SEQ ID NO:9 (128-fold, AD2 from Liu et al. (1995) The Plant Journal, 8: 457-463); CA51 primer: 5′ TGW GNA GSA NCA SAG A 3′: SEQ ID NO:10 (128-fold derivative of AD1 from Liu and Whittier (1995) Genomics, 25: 674-681); CA52 primer: 5′ AGW GNA GWA NCA WAG G 3′: SEQ ID NO:11 (128-fold, AD2 from Liu and Whittier (1995) Genomics, 25: 674-681); CA53 primer: 5′ STT GNT AST NCT NTG C 3′: SEQ ID NO:12 (256-fold, AD5 from Tsugeki et al. (1996) The Plant Journal, 10: 479-489); CA54 primer: 5′ NTC GAS TWT SGW GTT 3′: SEQ ID NO:13 (64-fold, AD1 from Liu et al. (1995) The Plant Journal, 8: 457-463); and CA55 primer: 5′ WGT GNA GWA NCA NAG A 3′: SEQ ID NO:14 (256-fold, AD3 from Liu et al. (1995) The Plant Journal, 8: 457-463)) are used in combination with two sets of nested, T-DNA specific primers for the right border (CA66 primer: 5′ ATT AGG CAC CCC AGG CTT TAC ACT TTA TG 3′: SEQ ID NO:15 (pCSA104 right border primary primer); CA67 primer: 5′ GTA TGT TGT GTG GAA TTG TGA GCG GAT AAC 3′: SEQ ID NO:16 (pCSA104 right border secondary primer); and CA68 primer: 5′ TAA CAA TTT CAC ACA GGA AAC AGC TAT GAC 3′: SEQ ID NO:17 (pCSA104 right border tertiary primer)) as well as for the left border (JM33 primer: 5′ TAG CAT CTG AAT TTC ATA ACC AAT CTC GAT ACA C 3′: SEQ ID NO:18 (pCSA104 left border tertiary primer); JM34 primer: 5′ GCT TCC TAT TAT ATC TTC CCA AAT TAC CAA TAC A 3′: SEQ ID NO:19 (pCSA104 left border secondary primer); and JM35 primer:

[0191] 5′ GCC TTT TCA GAA ATG GAT AAA TAG CCT TGC TTC C 3′: SEQ ID NO:20 (pCSA104 left border primary primer)) of the T-DNA region of pCSA104.
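The "fold" values quoted for the arbitrary degenerate primers are the number of distinct sequences each primer represents, obtained by multiplying the number of bases allowed at every position under the standard IUPAC nucleotide codes. The short sketch below recomputes those values from the primer sequences given above; the function name is illustrative.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer):
    """Return how many distinct sequences a degenerate primer encodes."""
    fold = 1
    for base in primer.replace(" ", "").upper():
        fold *= IUPAC_CHOICES[base]
    return fold


# Arbitrary degenerate primers as listed in the text.
AD_PRIMERS = {
    "CA50": "NGT CGA SWG ANA WGA A",
    "CA51": "TGW GNA GSA NCA SAG A",
    "CA52": "AGW GNA GWA NCA WAG G",
    "CA53": "STT GNT AST NCT NTG C",
    "CA54": "NTC GAS TWT SGW GTT",
    "CA55": "WGT GNA GWA NCA NAG A",
}

for name, seq in AD_PRIMERS.items():
    print(f"{name}: {degeneracy(seq)}-fold")
```

The computed values agree with the degeneracies stated for each primer (128, 128, 128, 256, 64, and 256).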

[0192] A total of 10 products are obtained from the left border; two of the sequenced products represent both sides of the T-DNA insertion. PCR primers specific to the genomic region are then designed and used to confirm the border products obtained by TAIL PCR.

Example 5a

[0193] Sequence Analysis of Tagged Embryo-Lethal Line #1917 From the Insertional Mutant Collection

[0194] Analysis of Arabidopsis thaliana genomic DNA sequence flanking the right border region of the T-DNA insert in line 1917 reveals a single exon open reading frame of 1,656 bp (SEQ ID NO:1). Arabidopsis thaliana genomic DNA flanking the T-DNA border is identical to the ESTs 166E6T7 (Genbank Accession # R30603) and 203E14T7 (Genbank Accession # H77096) and to portions of the genomic survey sequences T19C17TR (Genbank Accession # B28763) and F13K23-Sp6 (Genbank Accession # B10372).

[0195] Using GAP (SeqWeb version 10.0, GCG), pairwise comparisons of the protein sequence (SEQ ID NO:2) and the input sequences shown below give a measure of similarity between SEQ ID NO:2 and the indicated sequences. The results are summarized below.

GenPept Accession #    % Identity    % Similarity
P37880¹                47            63
NP_002878.1²           46            63
Q55486³                48            62
Q19825⁴                43            60
AE001641⁵              42            57
AL079345⁶              40            57
P43832⁷                40            56
P11875⁸                40            58
NP_010628.1⁹           30            49
AL031853¹⁰             31            43
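Percent identity in a pairwise comparison is the share of aligned positions at which the two sequences carry the same residue. The sketch below computes it from two already-aligned sequences. It is not the GAP program: the alignment itself is taken as given, the example sequences are hypothetical, and excluding gap positions from the denominator is an assumption about the convention used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned positions with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments, not taken from the sequence listing.
query = "MKT-AYL"
subject = "MRTQAYI"
print(f"{percent_identity(query, subject):.1f}% identity")
```

Percent similarity is computed the same way, except that a column also counts when the two residues score above a threshold in the substitution matrix used by the alignment program.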

Example 5b

[0196] Sequence Analysis of Tagged Embryo-Lethal Line #2092 From the Insertional Mutant Collection

[0197] Analysis of Arabidopsis thaliana genomic DNA sequence flanking the right border of the T-DNA insert in line 2092 shows that the T-DNA has inserted into a region of the genome represented by P1 clone MRN17 (GenBank accession AB005243). Further analysis of the insertion site shows that this region contains a gene with sequence identity to genes encoding an alanyl tRNA synthetase.

[0198] Using GAP (SeqWeb version 10.0, GCG), pairwise comparisons of the protein sequence (SEQ ID NO:4) and the input sequences shown below give a measure of similarity between SEQ ID NO:4 and the indicated sequences. The results are summarized below.

Genbank Accession #    % Identity    % Similarity
G2500959¹              57.6          67.3
AE000353²              47.3          55.3
NP_014980³             38.3          48.9
AF188718⁴              36.9          46.3
AB033096⁵              34.2          42.4

Example 5c

[0199] Sequence Analysis of Tagged Embryo-Lethal Line #7724 From the Insertional Mutant Collection

[0200] The sequence of both TAIL PCR border products matches the sequence from the BAC clone F4L23 (Accession AC002387). Further analysis of these products reveals a 20 base pair deletion that occurred upon T-DNA insertion in line #7724, corresponding to base numbers 60,450 through 60,469 of BAC clone F4L23. Analysis of the DNA sequence from the recovered borders reveals homology to 2′-phosphotransferase genes. Further inspection of recovered border fragments reveals that the T-DNA has inserted in the middle of the coding region for a gene that encodes a protein with greater than 30% identity to the 2′-phosphotransferase-like genes from microorganisms listed below.

[0201] Using GAP (SeqWeb version 10.0, GCG), pairwise comparisons of the protein sequence (SEQ ID NO:6) and the input sequences shown below give a measure of similarity between SEQ ID NO:6 and the indicated sequences. The results are summarized below.

Genbank Accession #    % Identity
NP_014539¹             35.8
CAA22225²              33.5
CAB16372³              33.5
BAA29229⁴              32.4
AAB90829⁵              30.7

Example 6a

[0202] Isolation and Identification of 1917 cDNA Coding Region

[0203] The isolation and characterization of a cDNA clone corresponding to the Arabidopsis thaliana gene encoding arginyl-tRNA synthetase is disclosed in Genbank accession # Z98760.

Example 6b

[0204] Isolation and Identification of 2092 cDNA Coding Region

[0205] The full length cDNA for gene 2092 was isolated using the Marathon cDNA amplification kit (Clontech). Primers JM99 (5′-ACTTCACTGCCTTCAGAAACCCTTATCACAG-3′: SEQ ID NO:27) and AP1 (part of the Clontech kit) are used in the first round of amplification on cDNA template generated from 14-day old Arabidopsis seedlings. Then, JM100 (5′-CTTATCACAGGCTTCCCATTCACCAAAAGAC-3′: SEQ ID NO:28) and AP2 (Clontech) are used in nested PCR reactions to generate the final full-length sequence. Nine independent products are TA cloned, sequenced, and assembled into a single contig using the full sequence of clone 18709 from the Arabidopsis EST project.

Example 6c

[0206] Isolation and Identification of 7724 cDNA Coding Region

[0207] Sequence analysis of EST sequences derived from clone 10409 showed that it contained the entire coding region. The two EST sequences derived from the 5′ and 3′ ends of clone 10409 do not overlap. Additional sequencing reactions were performed to complete determination of the sequence of the entire clone. Analysis of the final sequence showed a 2937 bp ORF that encodes the entire deduced protein.
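Identifying an ORF in a finished clone sequence amounts to scanning each reading frame for a start codon followed in frame by a stop codon and keeping the longest such stretch. The following is a minimal sketch of that idea using the standard genetic code. It scans the forward strand only, the test sequence is a toy example rather than clone 10409, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon ('*' marks a stop codon)."""
    dna = dna.upper().replace(" ", "")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def longest_orf(dna):
    """Return the longest ATG-to-stop ORF on the forward strand, or ''."""
    dna = dna.upper().replace(" ", "")
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this open stretch
            elif start is not None and CODON_TABLE[codon] == "*":
                orf = dna[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


toy = "CCATGGCAGCTTAAGG"  # toy sequence for illustration only
orf = longest_orf(toy)
print(orf, len(orf), translate(orf))

# First 16 codons of SEQ ID NO:1 as given in the sequence listing.
print(translate("atg gca gct aat gaa gaa ttt acg gga aat ctg aaa cgt caa ctc gcg"))
```

A production analysis would also scan the reverse complement and tolerate ambiguous bases, which this sketch does not attempt.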

Example 7a

[0208] Expression of Recombinant 1917 Protein in Heterologous Expression Systems

[0209] The coding region of the protein, corresponding to the cDNA clone SEQ ID NO:1, is subcloned into previously described expression vectors, and transformed into E. coli using the manufacturer's conditions. Specific examples include plasmids such as pBluescript (Stratagene, La Jolla, Calif.), the pET vector system (Novagen, Inc., Madison, Wis.), pFLAG (International Biotechnologies, Inc., New Haven, Conn.), and pTrcHis (Invitrogen, La Jolla, Calif.). E. coli is cultured, and expression of the 1917 activity is confirmed. Alternatively, eukaryotic expression systems such as cultured insect cells infected with specific viruses may be preferred. Examples of vectors and insect cell lines are described previously. Protein conferring 1917 activity is isolated using standard techniques.

Example 7b

[0210] Expression of Recombinant 2092 Protein in Heterologous Expression Systems

[0211] The coding region of the protein, corresponding to the cDNA clone SEQ ID NO:3, is subcloned into previously described expression vectors, and transformed into E. coli using the manufacturer's conditions. Specific examples include plasmids such as pBluescript (Stratagene, La Jolla, Calif.), the pET vector system (Novagen, Inc., Madison, Wis.), pFLAG (International Biotechnologies, Inc., New Haven, Conn.), and pTrcHis (Invitrogen, La Jolla, Calif.). E. coli is cultured, and expression of the 2092 activity is confirmed. Alternatively, eukaryotic expression systems such as cultured insect cells infected with specific viruses may be preferred. Examples of vectors and insect cell lines are described previously. Protein conferring 2092 activity is isolated using standard techniques.

Example 7c

[0212] Expression of Recombinant 7724 Protein in Heterologous Expression Systems

[0213] The coding region of the protein, corresponding to the cDNA clone SEQ ID NO:5, is subcloned into previously described expression vectors, and transformed into E. coli using the manufacturer's conditions. Specific examples include plasmids such as pBluescript (Stratagene, La Jolla, Calif.), the pET vector system (Novagen, Inc., Madison, Wis.), pFLAG (International Biotechnologies, Inc., New Haven, Conn.), and pTrcHis (Invitrogen, La Jolla, Calif.). E. coli is cultured, and expression of the 7724 activity is confirmed. Alternatively, eukaryotic expression systems such as cultured insect cells infected with specific viruses may be preferred. Examples of vectors and insect cell lines are described previously. Protein conferring 7724 activity is isolated using standard techniques.

Example 8a

[0214] In vitro Recombination of 1917 Genes by DNA Shuffling

[0215] The nucleotide sequence shown in SEQ ID NO:1 is amplified by PCR. The resulting DNA fragment is digested by DNase I treatment essentially as described (Stemmer et al. (1994) PNAS 91: 10747-10751) and the PCR primers are removed from the reaction mixture. A PCR reaction is carried out without primers and is followed by a PCR reaction with the primers, both as described (Stemmer et al. (1994) PNAS 91: 10747-10751). The resulting DNA fragments are cloned into pTRC99a (Pharmacia, Cat no: 27-5007-01) for use in bacteria, or into pESC vectors (Stratagene Catalog) for use in yeast; and transformed into a bacterial or yeast strain deficient in 1917 activity by electroporation using the Biorad Gene Pulser and the manufacturer's conditions. The transformed bacteria or yeast are grown on medium that contains inhibitory concentrations of an inhibitor of 1917 activity and those colonies that grow in the presence of the inhibitor are selected. Colonies that grow in the presence of normally inhibitory concentrations of inhibitor are picked and purified by repeated restreaking. Their plasmids are purified and the DNA sequences of cDNA inserts from plasmids that pass this test are then determined.

[0216] In a similar reaction, PCR-amplified DNA fragments comprising the A. thaliana 1917 gene encoding the protein and PCR-amplified DNA fragments comprising the 1917 gene from E. coli are recombined in vitro and resulting variants with improved tolerance to the inhibitor are recovered as described above.

Example 8b

[0217] In vitro Recombination of 2092 Genes by DNA Shuffling

[0218] The nucleotide sequence shown in SEQ ID NO:3 is amplified by PCR. The resulting DNA fragment is digested by DNase I treatment essentially as described (Stemmer et al. (1994) PNAS 91: 10747-10751) and the PCR primers are removed from the reaction mixture. A PCR reaction is carried out without primers and is followed by a PCR reaction with the primers, both as described (Stemmer et al. (1994) PNAS 91: 10747-10751). The resulting DNA fragments are cloned into pTRC99a (Pharmacia, Cat no: 27-5007-01) for use in bacteria, or into pESC vectors (Stratagene Catalog) for use in yeast; and transformed into a bacterial or yeast strain deficient in 2092 activity by electroporation using the Biorad Gene Pulser and the manufacturer's conditions. The transformed bacteria or yeast are grown on medium that contains inhibitory concentrations of an inhibitor of 2092 activity and those colonies that grow in the presence of the inhibitor are selected. Colonies that grow in the presence of normally inhibitory concentrations of inhibitor are picked and purified by repeated restreaking. Their plasmids are purified and the DNA sequences of cDNA inserts from plasmids that pass this test are then determined.

[0219] In a similar reaction, PCR-amplified DNA fragments comprising the A. thaliana 2092 gene encoding the protein and PCR-amplified DNA fragments comprising the 2092 gene from E. coli are recombined in vitro and resulting variants with improved tolerance to the inhibitor are recovered as described above.

Example 8c

[0220] In vitro Recombination of 7724 Genes by DNA Shuffling

[0221] The nucleotide sequence shown in SEQ ID NO:5 is amplified by PCR. The resulting DNA fragment is digested by DNase I treatment essentially as described (Stemmer et al. (1994) PNAS 91: 10747-10751) and the PCR primers are removed from the reaction mixture. A PCR reaction is carried out without primers and is followed by a PCR reaction with the primers, both as described (Stemmer et al. (1994) PNAS 91: 10747-10751). The resulting DNA fragments are cloned into pTRC99a (Pharmacia, Cat no: 27-5007-01) for use in bacteria, or into pESC vectors (Stratagene Catalog) for use in yeast; and transformed into a bacterial or yeast strain deficient in 7724 activity by electroporation using the Biorad Gene Pulser and the manufacturer's conditions. The transformed bacteria or yeast are grown on medium that contains inhibitory concentrations of an inhibitor of 7724 activity and those colonies that grow in the presence of the inhibitor are selected. Colonies that grow in the presence of normally inhibitory concentrations of inhibitor are picked and purified by repeated restreaking. Their plasmids are purified and the DNA sequences of cDNA inserts from plasmids that pass this test are then determined.

[0222] In a similar reaction, PCR-amplified DNA fragments comprising the A. thaliana 7724 gene encoding the protein and PCR-amplified DNA fragments comprising the 7724 gene from E. coli are recombined in vitro and resulting variants with improved tolerance to the inhibitor are recovered as described above.

Example 9a

[0223] In vitro Recombination of 1917 Genes by Staggered Extension Process

[0224] The Arabidopsis thaliana 1917 gene encoding the 1917 protein and the E. coli 1917 homologous gene are each cloned into the polylinker of a pBluescript vector. A PCR reaction is carried out essentially as described (Zhao et al. (1998) Nature Biotechnology 16: 258-261) using the “reverse primer” and the “M13-20 primer” (Stratagene Catalog). Amplified PCR fragments are digested with appropriate restriction enzymes and cloned into pTRC99a and mutated 1917 genes are screened as described in Example 8a.

Example 9b

[0225] In vitro Recombination of 2092 Genes by Staggered Extension Process

[0226] The Arabidopsis thaliana 2092 gene encoding the 2092 protein and the E. coli 2092 homologous gene are each cloned into the polylinker of a pBluescript vector. A PCR reaction is carried out essentially as described (Zhao et al. (1998) Nature Biotechnology 16: 258-261) using the “reverse primer” and the “M13-20 primer” (Stratagene Catalog). Amplified PCR fragments are digested with appropriate restriction enzymes and cloned into pTRC99a and mutated 2092 genes are screened as described in Example 8b.

Example 9c

[0227] In vitro Recombination of 7724 Genes by Staggered Extension Process

[0228] The Arabidopsis thaliana 7724 gene encoding the 7724 protein and the E. coli 7724 homologous gene are each cloned into the polylinker of a pBluescript vector. A PCR reaction is carried out essentially as described (Zhao et al. (1998) Nature Biotechnology 16: 258-261) using the “reverse primer” and the “M13-20 primer” (Stratagene Catalog). Amplified PCR fragments are digested with appropriate restriction enzymes and cloned into pTRC99a and mutated 7724 genes are screened as described in Example 8c.

Example 10

[0229] In vitro Binding Assays

[0230] Recombinant 1917, 2092, or 7724 protein is obtained, for example, according to Example 7a, 7b, or 7c, respectively. The protein is immobilized on chips appropriate for ligand binding assays using techniques that are well known in the art. The protein immobilized on the chip is exposed to sample compound in solution according to methods well known in the art. While the sample compound is in contact with the immobilized protein, measurements capable of detecting protein-ligand interactions are conducted. Examples of such measurements are SELDI, Biacore and FCS, described above. Compounds found to bind the protein are readily discovered in this fashion and are subjected to further characterization.

Example 11

[0231] Plastid Transformation

[0232] Transformation Vectors

[0233] For expression of a nucleotide sequence encoding a polypeptide having 1917, 2092, or 7724 activity in plant plastids, plastid transformation vector pPH143 or pPH145 (WO 97/32011, incorporated herein by reference) is used. The nucleotide sequence is inserted into pPH143 thereby replacing the PROTOX coding sequence. This vector is then used for plastid transformation and selection of transformants for spectinomycin resistance. Alternatively, the nucleotide sequence is inserted in pPH143 so that it replaces the aadA gene. In this case, transformants are selected for resistance to PROTOX inhibitors.

[0234] Plastid Transformation

[0235] Seeds of Nicotiana tabacum c.v. ‘Xanthi nc’ are germinated seven per plate in a 1″ circular array on T agar medium and bombarded 12-14 days after sowing with 1 μm tungsten particles (M10, Biorad, Hercules, Calif.) coated with DNA from plasmids pPH143 and pPH145 essentially as described (Svab, Z. and Maliga, P. (1993) Proc. Natl. Acad. Sci. USA 90, 913-917). Bombarded seedlings are incubated on T medium for two days after which leaves are excised and placed abaxial side up in bright light (350-500 μmol photons/m²/s) on plates of RMOP medium (Svab, Z., Hajdukiewicz, P. and Maliga, P. (1990) Proc. Natl. Acad. Sci. USA 87, 8526-8530) containing 500 μg/ml spectinomycin dihydrochloride (Sigma, St. Louis, Mo.). Resistant shoots appearing underneath the bleached leaves three to eight weeks after bombardment are subcloned onto the same selective medium, allowed to form callus, and secondary shoots isolated and subcloned. Complete segregation of transformed plastid genome copies (homoplasmicity) in independent subclones is assessed by standard techniques of Southern blotting (Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor). Homoplasmic shoots are rooted aseptically on spectinomycin-containing MS/IBA medium (McBride, K. E. et al. (1994) Proc. Natl. Acad. Sci. USA 91, 7301-7305) and transferred to the greenhouse.

Example 12a

[0236] In vitro assay for Arginyl tRNA Synthetase

[0237] The arginyl tRNA synthetase activity assay is derived from Pope et al. (1998) J. Biol. Chem. 273, 31691-31701 and references cited therein. The reaction volumes are preferably the ones described below, but can be varied depending on the experimental requirements. The assay can be performed by mixing 0.2-5 nM, but preferably 1 nM, of an enzyme having arginyl tRNA synthetase activity, 0.1-10 μM, but preferably 1 μM, L-[U-¹⁴C] arginine, and 0.1-10 μM, but preferably 1 μM, of tRNA^(Arg) in a final volume of 50 μL of 50 mM Tris-HCl (pH 7.0-9.0, but preferably 7.9), 1-20 mM, but preferably 10 mM, MgCl₂, 1-100 mM, but preferably 50 mM, KCl, and 0.1-20 mM, but preferably 2 mM, dithiothreitol. After a time interval, 100 μL of 7% trichloroacetic acid is added and the mixture is incubated on ice for 10 minutes. Trichloroacetic acid-precipitable material can be harvested using 0.45 μm polyvinylidene difluoride multiwell plates and counted by scintillation.
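Setting up the 50 μL reaction at the preferred concentrations is a dilution calculation of the form C1·V1 = C2·V2. The sketch below works through it for the buffer components. The stock concentrations are hypothetical bench values chosen for illustration and are not specified in the disclosure; the final concentrations are the preferred values given above.

```python
FINAL_VOLUME_UL = 50.0


def stock_volume_ul(final_conc, stock_conc, final_volume_ul=FINAL_VOLUME_UL):
    """Volume of stock needed so that final_conc is reached in final_volume_ul.

    final_conc and stock_conc must be expressed in the same unit.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the final mix")
    return final_conc / stock_conc * final_volume_ul


# component: (preferred final concentration in mM, hypothetical stock in mM)
BUFFER_COMPONENTS = {
    "Tris-HCl pH 7.9": (50.0, 1000.0),
    "MgCl2": (10.0, 1000.0),
    "KCl": (50.0, 1000.0),
    "dithiothreitol": (2.0, 100.0),
}

for name, (final_mm, stock_mm) in BUFFER_COMPONENTS.items():
    print(f"{name}: {stock_volume_ul(final_mm, stock_mm):.2f} uL of {stock_mm:g} mM stock")

# 1 uM equals 1 pmol per uL, so 1 uM substrate in 50 uL is 50 pmol per reaction.
substrate_pmol = 1.0 * FINAL_VOLUME_UL
print(f"labeled amino acid per reaction: {substrate_pmol:g} pmol")
```

The remaining volume is made up with water after the enzyme, labeled amino acid, and tRNA have been added.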

Example 12b

[0238] In vitro assay for Alanyl tRNA Synthetase

[0239] The alanyl tRNA synthetase activity assay is derived from Pope et al. (1998) J. Biol. Chem. 273, 31691-31701 and references cited therein. The reaction volumes are preferably the ones described below, but can be varied depending on the experimental requirements. The assay can be performed by mixing 0.2-5 nM, but preferably 1 nM, of an enzyme having alanyl tRNA synthetase activity, 0.1-10 μM, but preferably 1 μM, L-[U-¹⁴C] alanine, and 0.1-10 μM, but preferably 1 μM, of tRNA^(Ala) in a final volume of 50 μL of 50 mM Tris-HCl (pH 7.0-9.0, but preferably 7.9), 1-20 mM, but preferably 10 mM, MgCl₂, 1-100 mM, but preferably 50 mM, KCl, and 0.1-20 mM, but preferably 2 mM, dithiothreitol. After a time interval, 100 μL of 7% trichloroacetic acid is added and the mixture is incubated on ice for 10 minutes. Trichloroacetic acid-precipitable material can be harvested using 0.45 μm polyvinylidene difluoride multiwell plates and counted by scintillation.

Example 12c

[0240] In vitro assay for 2′-Phosphotransferase

[0241] Many eukaryotes, including the yeast Saccharomyces cerevisiae, humans, and plants contain tRNA gene families whose members contain intervening sequences (Culbertson, M. R. and M. Winey (1989) Yeast 5: 405-427). Joining of the tRNA exons involves a ligase that generates a mature sized tRNA bearing a splice junction 2′-phosphate (Greer et al (1983) Cell 32: 537-546). The removal of the splice junction 2′-phosphate is catalyzed by a 2′-phosphotransferase that transfers the splice junction phosphate to NAD, forming ADP-ribose 1′-2′ cyclic phosphate (Culver et al (1993) Science 261: 206-208).

[0242] An assay for the 2′-phosphotransferase may be performed in which a ligated tRNA with a ³³P- or ³²P-labeled splice junction 2′-phosphate is prepared by in vitro endonucleolytic cleavage and ligation of an (α-³³P) or (α-³²P) ATP-labeled pre-tRNA transcript (McCraith et al (1991) J. Biol. Chem. 266: 11986-11992). The labeled pre-tRNA transcript can be derived by in vitro transcription of a plasmid-borne copy of the end-matured pre-tRNA gene (Reyes et al (1987) Anal. Biochem. 166: 90-106). Alternatively, the pre-tRNA may be synthesized by chemical coupling of the ribonucleic acid building blocks using an oligonucleotide synthesizer. The ligated tRNA with a labeled splice junction may be attached to a scintillant-coated solid support such as a bead, e.g., an SPA bead (Amersham Pharmacia), or a microtiter plate surface, e.g., the Flash Plate (NEN), by covalent attachment or through ligand-ligand interaction, such as biotin-avidin. The radiation given off by the surface-bound, labeled pre-tRNA collides with the scintillator molecules on the solid support. The energy is converted into photons that are measured and quantified by appropriate light-measuring instrumentation. A reaction mixture consisting of an enzyme having 2′-phosphotransferase activity and NAD in a buffer appropriate for the activity of the 2′-phosphotransferase is added to a microtiter plate containing the surface-bound, labeled pre-tRNA. The action of the enzyme will result in the release of the radioisotope from the surface-bound pre-tRNA and, therefore, a decrease in signal. Aspiration and washing steps may be required to eliminate interference from unbound radiolabel.

[0243] Alternatively, an oligonucleotide complementary to the labeled pre-tRNA transcript is attached to a solid support. In this case, a reaction mixture consisting of 2′-phosphotransferase, NAD, and unbound, labeled pre-tRNA is incubated for an appropriate period of time and then added to the plate containing the bound, complementary oligonucleotide. The pre-tRNA anneals to the complementary oligonucleotide. The signal arising from any radiolabel remaining on the pre-tRNA is quantified as described above. Aspiration and washing steps may be required to eliminate interference from unbound radiolabel.
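
In both assay formats described above, 2′-phosphotransferase activity lowers the measured signal, so a candidate inhibitor would be expected to leave the signal closer to that of a well lacking enzyme activity. The following is a minimal illustrative sketch, not part of the disclosed method, of how percent inhibition might be computed from such readings. The function name, the choice of control wells (uninhibited enzyme and no enzyme), and the count values are all assumptions made for illustration.

```python
from statistics import mean


def percent_inhibition(test_counts, uninhibited_counts, no_enzyme_counts):
    """Return percent inhibition of label release for one test well.

    All arguments are hypothetical scintillation readings (e.g. cpm).
    uninhibited_counts: wells with enzyme and NAD, no test compound (low signal).
    no_enzyme_counts:   wells without enzyme activity (high signal).
    """
    low = mean(uninhibited_counts)   # full activity: most label released
    high = mean(no_enzyme_counts)    # no activity: label retained on the tRNA
    window = high - low
    if window <= 0:
        raise ValueError("no-enzyme signal must exceed uninhibited signal")
    # 0% when the test well matches the uninhibited control,
    # 100% when it matches the no-enzyme control.
    return 100.0 * (test_counts - low) / window


if __name__ == "__main__":
    uninhibited = [1200.0, 1180.0, 1220.0]    # assumed example values
    no_enzyme = [9800.0, 10000.0, 10200.0]    # assumed example values
    for cpm in (1200.0, 5600.0, 10000.0):
        pct = percent_inhibition(cpm, uninhibited, no_enzyme)
        print(f"{cpm:.0f} cpm -> {pct:.1f}% inhibition")
```

Normalizing against both controls, rather than reporting raw counts, allows wells from different plates or label batches to be compared on a common scale.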

[0244] The above-disclosed embodiments are illustrative. This disclosure of the invention will place one skilled in the art in possession of many variations of the invention. All such obvious and foreseeable variations are intended to be encompassed by the appended claims.

1 27 1 1773 DNA Arabidopsis thaliana CDS (1)..(1773) 1 atg gca gct aatgaa gaa ttt acg gga aat ctg aaa cgt caa ctc gcg 48 Met Ala Ala Asn GluGlu Phe Thr Gly Asn Leu Lys Arg Gln Leu Ala 1 5 10 15 aag ctc ttt gatgtt tct cta aaa tta acg gtt cct gat gaa cct agt 96 Lys Leu Phe Asp ValSer Leu Lys Leu Thr Val Pro Asp Glu Pro Ser 20 25 30 gtt gag ccc ttg gtggct gcc tcc gct ctt gga aaa ttt gga gat tac 144 Val Glu Pro Leu Val AlaAla Ser Ala Leu Gly Lys Phe Gly Asp Tyr 35 40 45 caa tgt aac aac gca atggga cta tgg tcc ata att aaa gga aag ggt 192 Gln Cys Asn Asn Ala Met GlyLeu Trp Ser Ile Ile Lys Gly Lys Gly 50 55 60 act cag ttc aag ggt cct ccagct gtt gga cag gcc ctt gtt aag agt 240 Thr Gln Phe Lys Gly Pro Pro AlaVal Gly Gln Ala Leu Val Lys Ser 65 70 75 80 ctc cct act tct gag atg gtagaa tca tgc tct gta gct gga cct ggc 288 Leu Pro Thr Ser Glu Met Val GluSer Cys Ser Val Ala Gly Pro Gly 85 90 95 ttt att aat gtt gta cta tca gctaag tgg atg gct aag agt att gaa 336 Phe Ile Asn Val Val Leu Ser Ala LysTrp Met Ala Lys Ser Ile Glu 100 105 110 aat atg ctc atc gat gga gtt gacaca tgg gca cct act ctt tcg gtt 384 Asn Met Leu Ile Asp Gly Val Asp ThrTrp Ala Pro Thr Leu Ser Val 115 120 125 aag aga gct gta gtt gat ttt tcctct ccc aac att gca aaa gaa atg 432 Lys Arg Ala Val Val Asp Phe Ser SerPro Asn Ile Ala Lys Glu Met 130 135 140 cat gtt ggt cat cta aga tca actatc att ggt gac act cta gct cgc 480 His Val Gly His Leu Arg Ser Thr IleIle Gly Asp Thr Leu Ala Arg 145 150 155 160 atg ctc gag tac tca cat gttgaa gtt cta cgc aga aac cat gtt ggt 528 Met Leu Glu Tyr Ser His Val GluVal Leu Arg Arg Asn His Val Gly 165 170 175 gac tgg gga aca cag ttt ggcatg cta att gag tac ctc ttt gag aaa 576 Asp Trp Gly Thr Gln Phe Gly MetLeu Ile Glu Tyr Leu Phe Glu Lys 180 185 190 ttt cct gat aca gat agt gtgacc gag aca gca att gga gat ctt cag 624 Phe Pro Asp Thr Asp Ser Val ThrGlu Thr Ala Ile Gly Asp Leu Gln 195 200 205 gtg ttt tac aag gca tca aaacat aaa ttt gat ctg gac gag gcc ttt 672 Val Phe Tyr Lys Ala Ser Lys HisLys Phe Asp 
Leu Asp Glu Ala Phe 210 215 220 aag gaa aaa gca caa cag gctgtg gtc cgt cta cag ggt ggt gat cct 720 Lys Glu Lys Ala Gln Gln Ala ValVal Arg Leu Gln Gly Gly Asp Pro 225 230 235 240 gtt tac cgt aag gct tgggct aag atc tgt gac atc agc cga act gag 768 Val Tyr Arg Lys Ala Trp AlaLys Ile Cys Asp Ile Ser Arg Thr Glu 245 250 255 ttt gcc aag gtt tac caacgc ctt cga gtt gag ctt gaa gaa aag gga 816 Phe Ala Lys Val Tyr Gln ArgLeu Arg Val Glu Leu Glu Glu Lys Gly 260 265 270 gaa agc ttt tac aac cctcat att gct aaa gta att gag gaa ttg aat 864 Glu Ser Phe Tyr Asn Pro HisIle Ala Lys Val Ile Glu Glu Leu Asn 275 280 285 agc aag ggg ttg gtt gaagaa agt gaa ggt gct cgt gtg att ttc ctt 912 Ser Lys Gly Leu Val Glu GluSer Glu Gly Ala Arg Val Ile Phe Leu 290 295 300 gaa ggc ttc gac atc ccactc atg gtt gta aag agt gat ggt ggt ttt 960 Glu Gly Phe Asp Ile Pro LeuMet Val Val Lys Ser Asp Gly Gly Phe 305 310 315 320 aac tat gcc tca acagat ctg act gct ctt tgg tac cgg ctc aat gaa 1008 Asn Tyr Ala Ser Thr AspLeu Thr Ala Leu Trp Tyr Arg Leu Asn Glu 325 330 335 gag aaa gct gag tggatc ata tat gtg acc gat gtt ggc cag cag cag 1056 Glu Lys Ala Glu Trp IleIle Tyr Val Thr Asp Val Gly Gln Gln Gln 340 345 350 cac ttt aat atg ttcttc aaa gct gcc aga aaa gca ggt tgg ctt cca 1104 His Phe Asn Met Phe PheLys Ala Ala Arg Lys Ala Gly Trp Leu Pro 355 360 365 gac aat gat aaa acttac cct aga gtt aac cat gtt ggt ttt ggt ctc 1152 Asp Asn Asp Lys Thr TyrPro Arg Val Asn His Val Gly Phe Gly Leu 370 375 380 gtc ctt ggg gaa gatggc aag cga ttt aga act cgg gca aca gat gta 1200 Val Leu Gly Glu Asp GlyLys Arg Phe Arg Thr Arg Ala Thr Asp Val 385 390 395 400 gtc cgc cta gttgat ttg cta gat gag gcc aag act cgc agt aaa ctt 1248 Val Arg Leu Val AspLeu Leu Asp Glu Ala Lys Thr Arg Ser Lys Leu 405 410 415 gcc ctt att gagcgc ggt aag gac aaa gaa tgg aca ccg gaa gaa ctg 1296 Ala Leu Ile Glu ArgGly Lys Asp Lys Glu Trp Thr Pro Glu Glu Leu 420 425 430 gac caa aca gctgag gca gtt gga tat ggt gcg gtc aag tat gct gac 1344 Asp Gln Thr Ala GluAla Val Gly Tyr 
Gly Ala Val Lys Tyr Ala Asp 435 440 445 ctg aag aac aacaga tta aca aat tat act ttc agc ttt gat caa atg 1392 Leu Lys Asn Asn ArgLeu Thr Asn Tyr Thr Phe Ser Phe Asp Gln Met 450 455 460 ctt aat gac aaggga aat aca gcc gtt tac ctt ctt tac gcc cat gct 1440 Leu Asn Asp Lys GlyAsn Thr Ala Val Tyr Leu Leu Tyr Ala His Ala 465 470 475 480 cgg atc tgttca atc atc aga aag tct ggc aaa gac ata gat gag ctg 1488 Arg Ile Cys SerIle Ile Arg Lys Ser Gly Lys Asp Ile Asp Glu Leu 485 490 495 aaa aag acagga aaa tta gca ttg gat cat gca gat gaa cga gca ctg 1536 Lys Lys Thr GlyLys Leu Ala Leu Asp His Ala Asp Glu Arg Ala Leu 500 505 510 ggg ctt cacttg ctt cga ttt gct gag acg gtg gag gaa gct tgt acc 1584 Gly Leu His LeuLeu Arg Phe Ala Glu Thr Val Glu Glu Ala Cys Thr 515 520 525 aac tta ttaccg agt gtt ctg tgc gag tac ctc tac aat tta tct gaa 1632 Asn Leu Leu ProSer Val Leu Cys Glu Tyr Leu Tyr Asn Leu Ser Glu 530 535 540 cac ttt accaga ttc tac tcc aat tgt cag gtc aat ggt tca cca gag 1680 His Phe Thr ArgPhe Tyr Ser Asn Cys Gln Val Asn Gly Ser Pro Glu 545 550 555 560 gag acaagc cgt ctc cta ctt tgt gaa gca acg gcc ata gtc atg cgg 1728 Glu Thr SerArg Leu Leu Leu Cys Glu Ala Thr Ala Ile Val Met Arg 565 570 575 aaa tgcttc cac ctt ctt gga atc act ccg gtt tac aag att tga 1773 Lys Cys Phe HisLeu Leu Gly Ile Thr Pro Val Tyr Lys Ile 580 585 590 2 590 PRTArabidopsis thaliana 2 Met Ala Ala Asn Glu Glu Phe Thr Gly Asn Leu LysArg Gln Leu Ala 1 5 10 15 Lys Leu Phe Asp Val Ser Leu Lys Leu Thr ValPro Asp Glu Pro Ser 20 25 30 Val Glu Pro Leu Val Ala Ala Ser Ala Leu GlyLys Phe Gly Asp Tyr 35 40 45 Gln Cys Asn Asn Ala Met Gly Leu Trp Ser IleIle Lys Gly Lys Gly 50 55 60 Thr Gln Phe Lys Gly Pro Pro Ala Val Gly GlnAla Leu Val Lys Ser 65 70 75 80 Leu Pro Thr Ser Glu Met Val Glu Ser CysSer Val Ala Gly Pro Gly 85 90 95 Phe Ile Asn Val Val Leu Ser Ala Lys TrpMet Ala Lys Ser Ile Glu 100 105 110 Asn Met Leu Ile Asp Gly Val Asp ThrTrp Ala Pro Thr Leu Ser Val 115 120 125 Lys Arg Ala Val Val Asp Phe SerSer Pro Asn Ile Ala Lys Glu 
Met 130 135 140 His Val Gly His Leu Arg SerThr Ile Ile Gly Asp Thr Leu Ala Arg 145 150 155 160 Met Leu Glu Tyr SerHis Val Glu Val Leu Arg Arg Asn His Val Gly 165 170 175 Asp Trp Gly ThrGln Phe Gly Met Leu Ile Glu Tyr Leu Phe Glu Lys 180 185 190 Phe Pro AspThr Asp Ser Val Thr Glu Thr Ala Ile Gly Asp Leu Gln 195 200 205 Val PheTyr Lys Ala Ser Lys His Lys Phe Asp Leu Asp Glu Ala Phe 210 215 220 LysGlu Lys Ala Gln Gln Ala Val Val Arg Leu Gln Gly Gly Asp Pro 225 230 235240 Val Tyr Arg Lys Ala Trp Ala Lys Ile Cys Asp Ile Ser Arg Thr Glu 245250 255 Phe Ala Lys Val Tyr Gln Arg Leu Arg Val Glu Leu Glu Glu Lys Gly260 265 270 Glu Ser Phe Tyr Asn Pro His Ile Ala Lys Val Ile Glu Glu LeuAsn 275 280 285 Ser Lys Gly Leu Val Glu Glu Ser Glu Gly Ala Arg Val IlePhe Leu 290 295 300 Glu Gly Phe Asp Ile Pro Leu Met Val Val Lys Ser AspGly Gly Phe 305 310 315 320 Asn Tyr Ala Ser Thr Asp Leu Thr Ala Leu TrpTyr Arg Leu Asn Glu 325 330 335 Glu Lys Ala Glu Trp Ile Ile Tyr Val ThrAsp Val Gly Gln Gln Gln 340 345 350 His Phe Asn Met Phe Phe Lys Ala AlaArg Lys Ala Gly Trp Leu Pro 355 360 365 Asp Asn Asp Lys Thr Tyr Pro ArgVal Asn His Val Gly Phe Gly Leu 370 375 380 Val Leu Gly Glu Asp Gly LysArg Phe Arg Thr Arg Ala Thr Asp Val 385 390 395 400 Val Arg Leu Val AspLeu Leu Asp Glu Ala Lys Thr Arg Ser Lys Leu 405 410 415 Ala Leu Ile GluArg Gly Lys Asp Lys Glu Trp Thr Pro Glu Glu Leu 420 425 430 Asp Gln ThrAla Glu Ala Val Gly Tyr Gly Ala Val Lys Tyr Ala Asp 435 440 445 Leu LysAsn Asn Arg Leu Thr Asn Tyr Thr Phe Ser Phe Asp Gln Met 450 455 460 LeuAsn Asp Lys Gly Asn Thr Ala Val Tyr Leu Leu Tyr Ala His Ala 465 470 475480 Arg Ile Cys Ser Ile Ile Arg Lys Ser Gly Lys Asp Ile Asp Glu Leu 485490 495 Lys Lys Thr Gly Lys Leu Ala Leu Asp His Ala Asp Glu Arg Ala Leu500 505 510 Gly Leu His Leu Leu Arg Phe Ala Glu Thr Val Glu Glu Ala CysThr 515 520 525 Asn Leu Leu Pro Ser Val Leu Cys Glu Tyr Leu Tyr Asn LeuSer Glu 530 535 540 His Phe Thr Arg Phe Tyr Ser Asn Cys Gln Val Asn GlySer Pro Glu 545 550 555 560 Glu Thr Ser 
Arg Leu Leu Leu Cys Glu Ala ThrAla Ile Val Met Arg 565 570 575 Lys Cys Phe His Leu Leu Gly Ile Thr ProVal Tyr Lys Ile 580 585 590 3 2937 DNA Arabidopsis thaliana CDS(1)..(2937) 3 atg aat ttc tcc aga gta aac ctc ttc gat ttt cct ctt agacca att 48 Met Asn Phe Ser Arg Val Asn Leu Phe Asp Phe Pro Leu Arg ProIle 1 5 10 15 ttg ctt tcg cat cct tct tct att ttc gtt tct aca cgt tttgtt acc 96 Leu Leu Ser His Pro Ser Ser Ile Phe Val Ser Thr Arg Phe ValThr 20 25 30 aga acc tct gca ggt gtt tct cct tct atc tta ctt ccc aga tcaact 144 Arg Thr Ser Ala Gly Val Ser Pro Ser Ile Leu Leu Pro Arg Ser Thr35 40 45 cag tct cct cag att att gct aag agc tca tca gta tca gta cag cca192 Gln Ser Pro Gln Ile Ile Ala Lys Ser Ser Ser Val Ser Val Gln Pro 5055 60 gtg tct gag gat gct aag gag gat tat cag tcc aaa gat gtt agt gga240 Val Ser Glu Asp Ala Lys Glu Asp Tyr Gln Ser Lys Asp Val Ser Gly 6570 75 80 gat tca ata cgg cgg cgt ttt ctt gaa ttc ttt gct tct cgt ggt cat288 Asp Ser Ile Arg Arg Arg Phe Leu Glu Phe Phe Ala Ser Arg Gly His 8590 95 aag gtg ctt cca agt tcg tct ctt gta cca gaa gat cct acc gtc ttg336 Lys Val Leu Pro Ser Ser Ser Leu Val Pro Glu Asp Pro Thr Val Leu 100105 110 cta aca att gca gga atg ctt cag ttt aag cct att ttc ctt gga aag384 Leu Thr Ile Ala Gly Met Leu Gln Phe Lys Pro Ile Phe Leu Gly Lys 115120 125 gta cct aga gag gtt cct tgt gca acc act gcg caa agg tgt ata cgt432 Val Pro Arg Glu Val Pro Cys Ala Thr Thr Ala Gln Arg Cys Ile Arg 130135 140 acg aat gat ttg gag aat gtt ggg aaa acg gct agg cac cat act ttc480 Thr Asn Asp Leu Glu Asn Val Gly Lys Thr Ala Arg His His Thr Phe 145150 155 160 ttt gag atg ctt ggg aac ttt agc ttt ggt gat tac ttc aag aaagaa 528 Phe Glu Met Leu Gly Asn Phe Ser Phe Gly Asp Tyr Phe Lys Lys Glu165 170 175 gcg ata aaa tgg gca tgg gag ctt tca act att gag ttt ggg ctacca 576 Ala Ile Lys Trp Ala Trp Glu Leu Ser Thr Ile Glu Phe Gly Leu Pro180 185 190 gct aat aga gtt tgg gtt agt ata tat gaa gac gat gat gaa gctttt 624 Ala Asn Arg Val Trp Val Ser Ile Tyr Glu Asp Asp Asp Glu Ala 
Phe195 200 205 gaa atc tgg aag aat gaa gtt ggt gtt tct gtt gag cgg ata aagaga 672 Glu Ile Trp Lys Asn Glu Val Gly Val Ser Val Glu Arg Ile Lys Arg210 215 220 atg ggt gaa gct gac aac ttt tgg act agt gga cca act ggt ccttgt 720 Met Gly Glu Ala Asp Asn Phe Trp Thr Ser Gly Pro Thr Gly Pro Cys225 230 235 240 ggt cca tgc tct gag ttg tac tat gac ttc tat cct gag agaggt tat 768 Gly Pro Cys Ser Glu Leu Tyr Tyr Asp Phe Tyr Pro Glu Arg GlyTyr 245 250 255 gat gaa gat gtt gat ctt ggg gat gat acc aga ttt att gagttc tat 816 Asp Glu Asp Val Asp Leu Gly Asp Asp Thr Arg Phe Ile Glu PheTyr 260 265 270 aat ttg gtt ttc atg cag tat aac aag acg gaa gat gga ttgctt gag 864 Asn Leu Val Phe Met Gln Tyr Asn Lys Thr Glu Asp Gly Leu LeuGlu 275 280 285 ccc ttg aaa cag aag aat ata gat act ggt ctt ggt ttg gaacgt ata 912 Pro Leu Lys Gln Lys Asn Ile Asp Thr Gly Leu Gly Leu Glu ArgIle 290 295 300 gct caa atc ctt cag aag gtt cca aac aac tac gag aca gatttg ata 960 Ala Gln Ile Leu Gln Lys Val Pro Asn Asn Tyr Glu Thr Asp LeuIle 305 310 315 320 tat cca atc att gca aag atc tca gag ttg gcg aat atctca tat gac 1008 Tyr Pro Ile Ile Ala Lys Ile Ser Glu Leu Ala Asn Ile SerTyr Asp 325 330 335 tct gca aat gac aag gca aag aca agt tta aaa gtg attgca gat cac 1056 Ser Ala Asn Asp Lys Ala Lys Thr Ser Leu Lys Val Ile AlaAsp His 340 345 350 atg cgg gca gtt gtc tat ctc ata tca gat ggt gtt tctcct tca aat 1104 Met Arg Ala Val Val Tyr Leu Ile Ser Asp Gly Val Ser ProSer Asn 355 360 365 att ggc aga ggt tat gtg gtt agg agg cta ata aga agagca gtt cgg 1152 Ile Gly Arg Gly Tyr Val Val Arg Arg Leu Ile Arg Arg AlaVal Arg 370 375 380 aag ggg aag tct ctc gga ata aat ggg gat atg aat ggtaat cta aag 1200 Lys Gly Lys Ser Leu Gly Ile Asn Gly Asp Met Asn Gly AsnLeu Lys 385 390 395 400 gga gcg ttt ttg cca gcg gtt gct gaa aag gtg atagag ttg agc act 1248 Gly Ala Phe Leu Pro Ala Val Ala Glu Lys Val Ile GluLeu Ser Thr 405 410 415 tat att gat tca gat gta aaa cta aag gcc tca cgcatc att gag gag 1296 Tyr Ile Asp Ser Asp Val Lys Leu Lys Ala Ser Arg 
IleIle Glu Glu 420 425 430 att agg caa gaa gaa ctt cac ttt aag aaa act ctggaa aga gga gaa 1344 Ile Arg Gln Glu Glu Leu His Phe Lys Lys Thr Leu GluArg Gly Glu 435 440 445 aag tta ctt gac caa aag ctt aac gat gca ttg tcaatt gct gat aaa 1392 Lys Leu Leu Asp Gln Lys Leu Asn Asp Ala Leu Ser IleAla Asp Lys 450 455 460 act aag gat acg cct tat ctg gat gga aaa gat gcgttt ctt ctt tat 1440 Thr Lys Asp Thr Pro Tyr Leu Asp Gly Lys Asp Ala PheLeu Leu Tyr 465 470 475 480 gac aca ttt ggc ttt cct gtg gag ata act gcagaa gtt gct gaa gaa 1488 Asp Thr Phe Gly Phe Pro Val Glu Ile Thr Ala GluVal Ala Glu Glu 485 490 495 cgt gga gtc agt ata gat atg aat ggt ttt gaagtg gaa atg gag aat 1536 Arg Gly Val Ser Ile Asp Met Asn Gly Phe Glu ValGlu Met Glu Asn 500 505 510 caa aga cgt caa tct caa gct gct cac aat gttgta aaa ctg aca gtt 1584 Gln Arg Arg Gln Ser Gln Ala Ala His Asn Val ValLys Leu Thr Val 515 520 525 gaa gac gat gct gac atg acg aaa aat att gcagac act gag ttc ctt 1632 Glu Asp Asp Ala Asp Met Thr Lys Asn Ile Ala AspThr Glu Phe Leu 530 535 540 gga tat gac agt ctc tct gct cgt gct gtt gtgaaa agt ctt ttg gtg 1680 Gly Tyr Asp Ser Leu Ser Ala Arg Ala Val Val LysSer Leu Leu Val 545 550 555 560 aat ggg aag cct gtg ata agg gtt tct gaaggc agt gaa gta gag gtt 1728 Asn Gly Lys Pro Val Ile Arg Val Ser Glu GlySer Glu Val Glu Val 565 570 575 ctg ctg gac aga act ccg ttc tat gct gaatca gga ggt caa att gca 1776 Leu Leu Asp Arg Thr Pro Phe Tyr Ala Glu SerGly Gly Gln Ile Ala 580 585 590 gat cat ggt ttt ctt tat gtt agc agt gatggg aac caa gag aaa gct 1824 Asp His Gly Phe Leu Tyr Val Ser Ser Asp GlyAsn Gln Glu Lys Ala 595 600 605 gtt gtt gag gta agt gat gtg cag aag tctctt aaa att ttt gtt cac 1872 Val Val Glu Val Ser Asp Val Gln Lys Ser LeuLys Ile Phe Val His 610 615 620 aag ggc act gta aaa agt gga gct cta gaagtt ggc aag gag gtg gaa 1920 Lys Gly Thr Val Lys Ser Gly Ala Leu Glu ValGly Lys Glu Val Glu 625 630 635 640 gca gca gta gat gca gac ttg agg caacga gcg aag gtt cac cat acg 1968 Ala Ala Val Asp Ala Asp Leu Arg 
Gln ArgAla Lys Val His His Thr 645 650 655 gcc act cat ttg ctc caa tcg gca cttaaa aaa gta gta gga caa gaa 2016 Ala Thr His Leu Leu Gln Ser Ala Leu LysLys Val Val Gly Gln Glu 660 665 670 aca tca cag gct ggt tca tta gta gctttt gac cgc ctc aga ttc gat 2064 Thr Ser Gln Ala Gly Ser Leu Val Ala PheAsp Arg Leu Arg Phe Asp 675 680 685 ttc aat ttt aat cgg tcc ctg cat gataat gag ctt gag gaa atc gaa 2112 Phe Asn Phe Asn Arg Ser Leu His Asp AsnGlu Leu Glu Glu Ile Glu 690 695 700 tgc ctg atc aat agg tgg att ggg gatgct aca cgt ctt gaa aca aaa 2160 Cys Leu Ile Asn Arg Trp Ile Gly Asp AlaThr Arg Leu Glu Thr Lys 705 710 715 720 gtc ctt cct ctt gct gat gca aaacgt gct gga gcc atc gca atg ttt 2208 Val Leu Pro Leu Ala Asp Ala Lys ArgAla Gly Ala Ile Ala Met Phe 725 730 735 ggg gaa aaa tat gat gaa aac gaggtt cgt gta gta gaa gtt cct ggt 2256 Gly Glu Lys Tyr Asp Glu Asn Glu ValArg Val Val Glu Val Pro Gly 740 745 750 gtc tcc atg gaa ctt tgt ggt ggcact cat gtt ggc aat act gca gaa 2304 Val Ser Met Glu Leu Cys Gly Gly ThrHis Val Gly Asn Thr Ala Glu 755 760 765 ata cga gcc ttc aag att atc tcagaa cag ggc att gca tct gga atc 2352 Ile Arg Ala Phe Lys Ile Ile Ser GluGln Gly Ile Ala Ser Gly Ile 770 775 780 cgg cgt ata gaa gcg gtt gca ggtgaa gca ttc att gaa tac ata aac 2400 Arg Arg Ile Glu Ala Val Ala Gly GluAla Phe Ile Glu Tyr Ile Asn 785 790 795 800 tca cgg gat tct caa atg acacgt cta tgc tcg act ctc aag gtg aaa 2448 Ser Arg Asp Ser Gln Met Thr ArgLeu Cys Ser Thr Leu Lys Val Lys 805 810 815 gca gag gat gtt aca aac agagtg gag aat ctt cta gag gaa cta cgt 2496 Ala Glu Asp Val Thr Asn Arg ValGlu Asn Leu Leu Glu Glu Leu Arg 820 825 830 gct gct aga aaa gaa gcc tccgac ttg cgt tca aaa gca gct gtc tat 2544 Ala Ala Arg Lys Glu Ala Ser AspLeu Arg Ser Lys Ala Ala Val Tyr 835 840 845 aaa gca tct gtc ata tcg aacaaa gca ttt act gta gga act tca cag 2592 Lys Ala Ser Val Ile Ser Asn LysAla Phe Thr Val Gly Thr Ser Gln 850 855 860 act ata aga gtg ctc gtt gagtcg atg gat gac acc gat gct gac tca 2640 Thr Ile Arg Val Leu 
Val Glu SerMet Asp Asp Thr Asp Ala Asp Ser 865 870 875 880 tta aag agt gca gct gagcat ttg ata agc aca ttg gaa gat cca gtc 2688 Leu Lys Ser Ala Ala Glu HisLeu Ile Ser Thr Leu Glu Asp Pro Val 885 890 895 gct gtg gta cta gga tcatct cca gaa aaa gac aag gtt agt tta gtt 2736 Ala Val Val Leu Gly Ser SerPro Glu Lys Asp Lys Val Ser Leu Val 900 905 910 gct gca ttt agt cct ggagta gtc tcc cta ggt gtt caa gca ggg aaa 2784 Ala Ala Phe Ser Pro Gly ValVal Ser Leu Gly Val Gln Ala Gly Lys 915 920 925 ttc att ggc ccc ata gctaag ctg tgt ggc gga gga ggt ggt gga aag 2832 Phe Ile Gly Pro Ile Ala LysLeu Cys Gly Gly Gly Gly Gly Gly Lys 930 935 940 ccc aat ttt gct cag gcaggc ggc aga aag cct gaa aat ctc cca agt 2880 Pro Asn Phe Ala Gln Ala GlyGly Arg Lys Pro Glu Asn Leu Pro Ser 945 950 955 960 gcc tta gag aaa gctcgg gaa gat ctc gtg gca act cta ttc gaa aag 2928 Ala Leu Glu Lys Ala ArgGlu Asp Leu Val Ala Thr Leu Phe Glu Lys 965 970 975 cta ggg tga 2937 LeuGly 4 978 PRT Arabidopsis thaliana 4 Met Asn Phe Ser Arg Val Asn Leu PheAsp Phe Pro Leu Arg Pro Ile 1 5 10 15 Leu Leu Ser His Pro Ser Ser IlePhe Val Ser Thr Arg Phe Val Thr 20 25 30 Arg Thr Ser Ala Gly Val Ser ProSer Ile Leu Leu Pro Arg Ser Thr 35 40 45 Gln Ser Pro Gln Ile Ile Ala LysSer Ser Ser Val Ser Val Gln Pro 50 55 60 Val Ser Glu Asp Ala Lys Glu AspTyr Gln Ser Lys Asp Val Ser Gly 65 70 75 80 Asp Ser Ile Arg Arg Arg PheLeu Glu Phe Phe Ala Ser Arg Gly His 85 90 95 Lys Val Leu Pro Ser Ser SerLeu Val Pro Glu Asp Pro Thr Val Leu 100 105 110 Leu Thr Ile Ala Gly MetLeu Gln Phe Lys Pro Ile Phe Leu Gly Lys 115 120 125 Val Pro Arg Glu ValPro Cys Ala Thr Thr Ala Gln Arg Cys Ile Arg 130 135 140 Thr Asn Asp LeuGlu Asn Val Gly Lys Thr Ala Arg His His Thr Phe 145 150 155 160 Phe GluMet Leu Gly Asn Phe Ser Phe Gly Asp Tyr Phe Lys Lys Glu 165 170 175 AlaIle Lys Trp Ala Trp Glu Leu Ser Thr Ile Glu Phe Gly Leu Pro 180 185 190Ala Asn Arg Val Trp Val Ser Ile Tyr Glu Asp Asp Asp Glu Ala Phe 195 200205 Glu Ile Trp Lys Asn Glu Val Gly Val Ser Val Glu Arg Ile 
Lys Arg 210215 220 Met Gly Glu Ala Asp Asn Phe Trp Thr Ser Gly Pro Thr Gly Pro Cys225 230 235 240 Gly Pro Cys Ser Glu Leu Tyr Tyr Asp Phe Tyr Pro Glu ArgGly Tyr 245 250 255 Asp Glu Asp Val Asp Leu Gly Asp Asp Thr Arg Phe IleGlu Phe Tyr 260 265 270 Asn Leu Val Phe Met Gln Tyr Asn Lys Thr Glu AspGly Leu Leu Glu 275 280 285 Pro Leu Lys Gln Lys Asn Ile Asp Thr Gly LeuGly Leu Glu Arg Ile 290 295 300 Ala Gln Ile Leu Gln Lys Val Pro Asn AsnTyr Glu Thr Asp Leu Ile 305 310 315 320 Tyr Pro Ile Ile Ala Lys Ile SerGlu Leu Ala Asn Ile Ser Tyr Asp 325 330 335 Ser Ala Asn Asp Lys Ala LysThr Ser Leu Lys Val Ile Ala Asp His 340 345 350 Met Arg Ala Val Val TyrLeu Ile Ser Asp Gly Val Ser Pro Ser Asn 355 360 365 Ile Gly Arg Gly TyrVal Val Arg Arg Leu Ile Arg Arg Ala Val Arg 370 375 380 Lys Gly Lys SerLeu Gly Ile Asn Gly Asp Met Asn Gly Asn Leu Lys 385 390 395 400 Gly AlaPhe Leu Pro Ala Val Ala Glu Lys Val Ile Glu Leu Ser Thr 405 410 415 TyrIle Asp Ser Asp Val Lys Leu Lys Ala Ser Arg Ile Ile Glu Glu 420 425 430Ile Arg Gln Glu Glu Leu His Phe Lys Lys Thr Leu Glu Arg Gly Glu 435 440445 Lys Leu Leu Asp Gln Lys Leu Asn Asp Ala Leu Ser Ile Ala Asp Lys 450455 460 Thr Lys Asp Thr Pro Tyr Leu Asp Gly Lys Asp Ala Phe Leu Leu Tyr465 470 475 480 Asp Thr Phe Gly Phe Pro Val Glu Ile Thr Ala Glu Val AlaGlu Glu 485 490 495 Arg Gly Val Ser Ile Asp Met Asn Gly Phe Glu Val GluMet Glu Asn 500 505 510 Gln Arg Arg Gln Ser Gln Ala Ala His Asn Val ValLys Leu Thr Val 515 520 525 Glu Asp Asp Ala Asp Met Thr Lys Asn Ile AlaAsp Thr Glu Phe Leu 530 535 540 Gly Tyr Asp Ser Leu Ser Ala Arg Ala ValVal Lys Ser Leu Leu Val 545 550 555 560 Asn Gly Lys Pro Val Ile Arg ValSer Glu Gly Ser Glu Val Glu Val 565 570 575 Leu Leu Asp Arg Thr Pro PheTyr Ala Glu Ser Gly Gly Gln Ile Ala 580 585 590 Asp His Gly Phe Leu TyrVal Ser Ser Asp Gly Asn Gln Glu Lys Ala 595 600 605 Val Val Glu Val SerAsp Val Gln Lys Ser Leu Lys Ile Phe Val His 610 615 620 Lys Gly Thr ValLys Ser Gly Ala Leu Glu Val Gly Lys Glu Val Glu 625 630 635 640 Ala 
AlaVal Asp Ala Asp Leu Arg Gln Arg Ala Lys Val His His Thr 645 650 655 AlaThr His Leu Leu Gln Ser Ala Leu Lys Lys Val Val Gly Gln Glu 660 665 670Thr Ser Gln Ala Gly Ser Leu Val Ala Phe Asp Arg Leu Arg Phe Asp 675 680685 Phe Asn Phe Asn Arg Ser Leu His Asp Asn Glu Leu Glu Glu Ile Glu 690695 700 Cys Leu Ile Asn Arg Trp Ile Gly Asp Ala Thr Arg Leu Glu Thr Lys705 710 715 720 Val Leu Pro Leu Ala Asp Ala Lys Arg Ala Gly Ala Ile AlaMet Phe 725 730 735 Gly Glu Lys Tyr Asp Glu Asn Glu Val Arg Val Val GluVal Pro Gly 740 745 750 Val Ser Met Glu Leu Cys Gly Gly Thr His Val GlyAsn Thr Ala Glu 755 760 765 Ile Arg Ala Phe Lys Ile Ile Ser Glu Gln GlyIle Ala Ser Gly Ile 770 775 780 Arg Arg Ile Glu Ala Val Ala Gly Glu AlaPhe Ile Glu Tyr Ile Asn 785 790 795 800 Ser Arg Asp Ser Gln Met Thr ArgLeu Cys Ser Thr Leu Lys Val Lys 805 810 815 Ala Glu Asp Val Thr Asn ArgVal Glu Asn Leu Leu Glu Glu Leu Arg 820 825 830 Ala Ala Arg Lys Glu AlaSer Asp Leu Arg Ser Lys Ala Ala Val Tyr 835 840 845 Lys Ala Ser Val IleSer Asn Lys Ala Phe Thr Val Gly Thr Ser Gln 850 855 860 Thr Ile Arg ValLeu Val Glu Ser Met Asp Asp Thr Asp Ala Asp Ser 865 870 875 880 Leu LysSer Ala Ala Glu His Leu Ile Ser Thr Leu Glu Asp Pro Val 885 890 895 AlaVal Val Leu Gly Ser Ser Pro Glu Lys Asp Lys Val Ser Leu Val 900 905 910Ala Ala Phe Ser Pro Gly Val Val Ser Leu Gly Val Gln Ala Gly Lys 915 920925 Phe Ile Gly Pro Ile Ala Lys Leu Cys Gly Gly Gly Gly Gly Gly Lys 930935 940 Pro Asn Phe Ala Gln Ala Gly Gly Arg Lys Pro Glu Asn Leu Pro Ser945 950 955 960 Ala Leu Glu Lys Ala Arg Glu Asp Leu Val Ala Thr Leu PheGlu Lys 965 970 975 Leu Gly 5 774 DNA Arabidopsis thaliana CDS(1)..(774) 5 atg gat gct tca aat ccc aat tct tct aga aaa tct aat gtc tcttcc 48 Met Asp Ala Ser Asn Pro Asn Ser Ser Arg Lys Ser Asn Val Ser Ser 15 10 15 ttc gct cag tcc agt cga agc ggt ggt aga gga gga gga tat gag aga96 Phe Ala Gln Ser Ser Arg Ser Gly Gly Arg Gly Gly Gly Tyr Glu Arg 20 2530 gat aac gat cga cgg aga cct cag ggt cgt ggc gac ggt gga ggc gga 144Asp Asn Asp Arg 
Arg Arg Pro Gln Gly Arg Gly Asp Gly Gly Gly Gly 35 40 45aag gat aga atc gat gca ctt gga cga ctc ttg acg aga ata ttg cga 192 LysAsp Arg Ile Asp Ala Leu Gly Arg Leu Leu Thr Arg Ile Leu Arg 50 55 60 catatg gct act gag ctg aga ttg aac atg aga ggt gat ggt ttt gtt 240 His MetAla Thr Glu Leu Arg Leu Asn Met Arg Gly Asp Gly Phe Val 65 70 75 80 aaagtt gaa gat tta ctt aac ctg aat ttg aaa act tct gca aat att 288 Lys ValGlu Asp Leu Leu Asn Leu Asn Leu Lys Thr Ser Ala Asn Ile 85 90 95 cag ttaaag tca cac acg att gat gaa att aga gag gct gtg aga agg 336 Gln Leu LysSer His Thr Ile Asp Glu Ile Arg Glu Ala Val Arg Arg 100 105 110 gac aataag caa cgg ttt agt ctc atc gat gag aat gga gag ctc ttg 384 Asp Asn LysGln Arg Phe Ser Leu Ile Asp Glu Asn Gly Glu Leu Leu 115 120 125 att cgcgct aac caa ggc cat tcg atc acg acg gtt gag tca gag aag 432 Ile Arg AlaAsn Gln Gly His Ser Ile Thr Thr Val Glu Ser Glu Lys 130 135 140 tta cttaaa cca ata ctg tca cca gaa gaa gct cca gtg tgt gta cat 480 Leu Leu LysPro Ile Leu Ser Pro Glu Glu Ala Pro Val Cys Val His 145 150 155 160 ggaact tat agg aag aat ttg gaa tcc atc tta gca tcg ggc tta aag 528 Gly ThrTyr Arg Lys Asn Leu Glu Ser Ile Leu Ala Ser Gly Leu Lys 165 170 175 cgtatg aat aga atg cat gtt cac ttc tct tgt gga tta cca aca gat 576 Arg MetAsn Arg Met His Val His Phe Ser Cys Gly Leu Pro Thr Asp 180 185 190 ggtgaa gtg att agt ggc atg aga aga aat gta aat gtt atc atc ttc 624 Gly GluVal Ile Ser Gly Met Arg Arg Asn Val Asn Val Ile Ile Phe 195 200 205 ctcgac atc aag aaa gct ctt gaa gat ggg att gcg ttc tac ata tca 672 Leu AspIle Lys Lys Ala Leu Glu Asp Gly Ile Ala Phe Tyr Ile Ser 210 215 220 gacaac aaa gtg att ttg act gaa ggc att gat ggt gta ttg cct gtc 720 Asp AsnLys Val Ile Leu Thr Glu Gly Ile Asp Gly Val Leu Pro Val 225 230 235 240gat tac ttc cag aag atc gag tct tgg cct gat cgg caa tcc ata cct 768 AspTyr Phe Gln Lys Ile Glu Ser Trp Pro Asp Arg Gln Ser Ile Pro 245 250 255ttc tga 774 Phe 6 257 PRT Arabidopsis thaliana 6 Met Asp Ala Ser Asn ProAsn Ser Ser Arg Lys Ser 
Asn Val Ser Ser 1 5 10 15 Phe Ala Gln Ser SerArg Ser Gly Gly Arg Gly Gly Gly Tyr Glu Arg 20 25 30 Asp Asn Asp Arg ArgArg Pro Gln Gly Arg Gly Asp Gly Gly Gly Gly 35 40 45 Lys Asp Arg Ile AspAla Leu Gly Arg Leu Leu Thr Arg Ile Leu Arg 50 55 60 His Met Ala Thr GluLeu Arg Leu Asn Met Arg Gly Asp Gly Phe Val 65 70 75 80 Lys Val Glu AspLeu Leu Asn Leu Asn Leu Lys Thr Ser Ala Asn Ile 85 90 95 Gln Leu Lys SerHis Thr Ile Asp Glu Ile Arg Glu Ala Val Arg Arg 100 105 110 Asp Asn LysGln Arg Phe Ser Leu Ile Asp Glu Asn Gly Glu Leu Leu 115 120 125 Ile ArgAla Asn Gln Gly His Ser Ile Thr Thr Val Glu Ser Glu Lys 130 135 140 LeuLeu Lys Pro Ile Leu Ser Pro Glu Glu Ala Pro Val Cys Val His 145 150 155160 Gly Thr Tyr Arg Lys Asn Leu Glu Ser Ile Leu Ala Ser Gly Leu Lys 165170 175 Arg Met Asn Arg Met His Val His Phe Ser Cys Gly Leu Pro Thr Asp180 185 190 Gly Glu Val Ile Ser Gly Met Arg Arg Asn Val Asn Val Ile IlePhe 195 200 205 Leu Asp Ile Lys Lys Ala Leu Glu Asp Gly Ile Ala Phe TyrIle Ser 210 215 220 Asp Asn Lys Val Ile Leu Thr Glu Gly Ile Asp Gly ValLeu Pro Val 225 230 235 240 Asp Tyr Phe Gln Lys Ile Glu Ser Trp Pro AspArg Gln Ser Ile Pro 245 250 255 Phe 7 3138 DNA Arabidopsis thaliana CDS(17)..(2953) 7 ctcctcatac tctctg atg aat ttc tcc aga gta aac ctc ttc gatttt cct 52 Met Asn Phe Ser Arg Val Asn Leu Phe Asp Phe Pro 1 5 10 cttaga cca att ttg ctt tcg cat cct tct tct att ttc gtt tct aca 100 Leu ArgPro Ile Leu Leu Ser His Pro Ser Ser Ile Phe Val Ser Thr 15 20 25 cgt tttgtt acc aga acc tct gca ggt gtt tct cct tct atc tta ctt 148 Arg Phe ValThr Arg Thr Ser Ala Gly Val Ser Pro Ser Ile Leu Leu 30 35 40 ccc aga tcaact cag tct cct cag att att gct aag agc tca tca gta 196 Pro Arg Ser ThrGln Ser Pro Gln Ile Ile Ala Lys Ser Ser Ser Val 45 50 55 60 tca gta cagcca gtg tct gag gat gct aag gag gat tat cag tcc aaa 244 Ser Val Gln ProVal Ser Glu Asp Ala Lys Glu Asp Tyr Gln Ser Lys 65 70 75 gat gtt agt ggagat tca ata cgg cgg cgt ttt ctt gaa ttc ttt gct 292 Asp Val Ser Gly AspSer Ile Arg Arg Arg Phe Leu Glu 
Phe Phe Ala 80 85 90 tct cgt ggt cat aaggtg ctt cca agt tcg tct ctt gta cca gaa gat 340 Ser Arg Gly His Lys ValLeu Pro Ser Ser Ser Leu Val Pro Glu Asp 95 100 105 cct acc gtc ttg ctaaca att gca gga atg ctt cag ttt aag cct att 388 Pro Thr Val Leu Leu ThrIle Ala Gly Met Leu Gln Phe Lys Pro Ile 110 115 120 ttc ctt gga aag gtacct aga gag gtt cct tgt gca acc act gcg caa 436 Phe Leu Gly Lys Val ProArg Glu Val Pro Cys Ala Thr Thr Ala Gln 125 130 135 140 agg tgt ata cgtacg aat gat ttg gag aat gtt ggg aaa acg gct agg 484 Arg Cys Ile Arg ThrAsn Asp Leu Glu Asn Val Gly Lys Thr Ala Arg 145 150 155 cac cat act ttcttt gag atg ctt ggg aac ttt agc ttt ggt gat tac 532 His His Thr Phe PheGlu Met Leu Gly Asn Phe Ser Phe Gly Asp Tyr 160 165 170 ttc aag aaa gaagcg ata aaa tgg gca tgg gag ctt tca act att gag 580 Phe Lys Lys Glu AlaIle Lys Trp Ala Trp Glu Leu Ser Thr Ile Glu 175 180 185 ttt ggg cta ccagct aat aga gtt tgg gtt agt ata tat gaa gac gat 628 Phe Gly Leu Pro AlaAsn Arg Val Trp Val Ser Ile Tyr Glu Asp Asp 190 195 200 gat gaa gct tttgaa atc tgg aag aat gaa gtt ggt gtt tct gtt gag 676 Asp Glu Ala Phe GluIle Trp Lys Asn Glu Val Gly Val Ser Val Glu 205 210 215 220 cgg ata aagaga atg ggt gaa gct gac aac ttt tgg act agt gga cca 724 Arg Ile Lys ArgMet Gly Glu Ala Asp Asn Phe Trp Thr Ser Gly Pro 225 230 235 act ggt ccttgt ggt cca tgc tct gag ttg tac tat gac ttc tat cct 772 Thr Gly Pro CysGly Pro Cys Ser Glu Leu Tyr Tyr Asp Phe Tyr Pro 240 245 250 gag aga ggttat gat gaa gat gtt gat ctt ggg gat gat acc aga ttt 820 Glu Arg Gly TyrAsp Glu Asp Val Asp Leu Gly Asp Asp Thr Arg Phe 255 260 265 att gag ttctat aat ttg gtt ttc atg cag tat aac aag acg gaa gat 868 Ile Glu Phe TyrAsn Leu Val Phe Met Gln Tyr Asn Lys Thr Glu Asp 270 275 280 gga ttg cttgag ccc ttg aaa cag aag aat ata gat act ggt ctt ggt 916 Gly Leu Leu GluPro Leu Lys Gln Lys Asn Ile Asp Thr Gly Leu Gly 285 290 295 300 ttg gaacgt ata gct caa atc ctt cag aag gtt cca aac aac tac gag 964 Leu Glu ArgIle Ala Gln Ile Leu Gln Lys Val Pro Asn Asn 
Tyr Glu 305 310 315 aca gatttg ata tat cca atc att gca aag atc tca gag ttg gcg aat 1012 Thr Asp LeuIle Tyr Pro Ile Ile Ala Lys Ile Ser Glu Leu Ala Asn 320 325 330 atc tcatat gac tct gca aat gac aag gca aag aca agt tta aaa gtg 1060 Ile Ser TyrAsp Ser Ala Asn Asp Lys Ala Lys Thr Ser Leu Lys Val 335 340 345 att gcagat cac atg cgg gca gtt gtc tat ctc ata tca gat ggt gtt 1108 Ile Ala AspHis Met Arg Ala Val Val Tyr Leu Ile Ser Asp Gly Val 350 355 360 tct ccttca aat att ggc aga ggt tat gtg gtt agg agg cta ata aga 1156 Ser Pro SerAsn Ile Gly Arg Gly Tyr Val Val Arg Arg Leu Ile Arg 365 370 375 380 agagca gtt cgg aag ggg aag tct ctc gga ata aat ggg gat atg aat 1204 Arg AlaVal Arg Lys Gly Lys Ser Leu Gly Ile Asn Gly Asp Met Asn 385 390 395 ggtaat cta aag gga gcg ttt ttg cca gcg gtt gct gaa aag gtg ata 1252 Gly AsnLeu Lys Gly Ala Phe Leu Pro Ala Val Ala Glu Lys Val Ile 400 405 410 gagttg agc act tat att gat tca gat gta aaa cta aag gcc tca cgc 1300 Glu LeuSer Thr Tyr Ile Asp Ser Asp Val Lys Leu Lys Ala Ser Arg 415 420 425 atcatt gag gag att agg caa gaa gaa ctt cac ttt aag aaa act ctg 1348 Ile IleGlu Glu Ile Arg Gln Glu Glu Leu His Phe Lys Lys Thr Leu 430 435 440 gaaaga gga gaa aag tta ctt gac caa aag ctt aac gat gca ttg tca 1396 Glu ArgGly Glu Lys Leu Leu Asp Gln Lys Leu Asn Asp Ala Leu Ser 445 450 455 460att gct gat aaa act aag gat acg cct tat ctg gat gga aaa gat gcg 1444 IleAla Asp Lys Thr Lys Asp Thr Pro Tyr Leu Asp Gly Lys Asp Ala 465 470 475ttt ctt ctt tat gac aca ttt ggc ttt cct gtg gag ata act gca gaa 1492 PheLeu Leu Tyr Asp Thr Phe Gly Phe Pro Val Glu Ile Thr Ala Glu 480 485 490gtt gct gaa gaa cgt gga gtc agt ata gat atg aat ggt ttt gaa gtg 1540 ValAla Glu Glu Arg Gly Val Ser Ile Asp Met Asn Gly Phe Glu Val 495 500 505gaa atg gag aat caa aga cgt caa tct caa gct gct cac aat gtt gta 1588 GluMet Glu Asn Gln Arg Arg Gln Ser Gln Ala Ala His Asn Val Val 510 515 520aaa ctg aca gtt gaa gac gat gct gac atg acg aaa aat att gca gac 1636 LysLeu Thr Val Glu Asp Asp Ala Asp Met Thr 
Lys Asn Ile Ala Asp 525 530 535540 act gag ttc ctt gga tat gac agt ctc tct gct cgt gct gtt gtg aaa 1684Thr Glu Phe Leu Gly Tyr Asp Ser Leu Ser Ala Arg Ala Val Val Lys 545 550555 agt ctt ttg gtg aat ggg aag cct gtg ata agg gtt tct gaa ggc agt 1732Ser Leu Leu Val Asn Gly Lys Pro Val Ile Arg Val Ser Glu Gly Ser 560 565570 gaa gta gag gtt ctg ctg gac aga act ccg ttc tat gct gaa tca gga 1780Glu Val Glu Val Leu Leu Asp Arg Thr Pro Phe Tyr Ala Glu Ser Gly 575 580585 ggt caa att gca gat cat ggt ttt ctt tat gtt agc agt gat ggg aac 1828Gly Gln Ile Ala Asp His Gly Phe Leu Tyr Val Ser Ser Asp Gly Asn 590 595600 caa gag aaa gct gtt gtt gag gta agt gat gtg cag aag tct ctt aaa 1876Gln Glu Lys Ala Val Val Glu Val Ser Asp Val Gln Lys Ser Leu Lys 605 610615 620 att ttt gtt cac aag ggc act gta aaa agt gga gct cta gaa gtt ggc1924 Ile Phe Val His Lys Gly Thr Val Lys Ser Gly Ala Leu Glu Val Gly 625630 635 aag gag gtg gaa gca gca gta gat gca gac ttg agg caa cga gcg aag1972 Lys Glu Val Glu Ala Ala Val Asp Ala Asp Leu Arg Gln Arg Ala Lys 640645 650 gtt cac cat acg gcc act cat ttg ctc caa tcg gca ctt aaa aaa gta2020 Val His His Thr Ala Thr His Leu Leu Gln Ser Ala Leu Lys Lys Val 655660 665 gta gga caa gaa aca tca cag gct ggt tca tta gta gct ttt gac cgc2068 Val Gly Gln Glu Thr Ser Gln Ala Gly Ser Leu Val Ala Phe Asp Arg 670675 680 ctc aga ttc gat ttc aat ttt aat cgg tcc ctg cat gat aat gag ctt2116 Leu Arg Phe Asp Phe Asn Phe Asn Arg Ser Leu His Asp Asn Glu Leu 685690 695 700 gag gaa atc gaa tgc ctg atc aat agg tgg att ggg gat gct acacgt 2164 Glu Glu Ile Glu Cys Leu Ile Asn Arg Trp Ile Gly Asp Ala Thr Arg705 710 715 ctt gaa aca aaa gtc ctt cct ctt gct gat gca aaa cgt gct ggagcc 2212 Leu Glu Thr Lys Val Leu Pro Leu Ala Asp Ala Lys Arg Ala Gly Ala720 725 730 atc gca atg ttt ggg gaa aaa tat gat gaa aac gag gtt cgt gtagta 2260 Ile Ala Met Phe Gly Glu Lys Tyr Asp Glu Asn Glu Val Arg Val Val735 740 745 gaa gtt cct ggt gtc tcc atg gaa ctt tgt ggt ggc act cat gttggc 2308 Glu Val Pro Gly Val Ser Met 
Glu Leu Cys Gly Gly Thr His Val Gly750 755 760 aat act gca gaa ata cga gcc ttc aag att atc tca gaa cag ggcatt 2356 Asn Thr Ala Glu Ile Arg Ala Phe Lys Ile Ile Ser Glu Gln Gly Ile765 770 775 780 gca tct gga atc cgg cgt ata gaa gcg gtt gca ggt gaa gcattc att 2404 Ala Ser Gly Ile Arg Arg Ile Glu Ala Val Ala Gly Glu Ala PheIle 785 790 795 gaa tac ata aac tca cgg gat tct caa atg aca cgt cta tgctcg act 2452 Glu Tyr Ile Asn Ser Arg Asp Ser Gln Met Thr Arg Leu Cys SerThr 800 805 810 ctc aag gtg aaa gca gag gat gtt aca aac aga gtg gag aatctt cta 2500 Leu Lys Val Lys Ala Glu Asp Val Thr Asn Arg Val Glu Asn LeuLeu 815 820 825 gag gaa cta cgt gct gct aga aaa gaa gcc tcc gac ttg cgttca aaa 2548 Glu Glu Leu Arg Ala Ala Arg Lys Glu Ala Ser Asp Leu Arg SerLys 830 835 840 gca gct gtc tat aaa gca tct gtc ata tcg aac aaa gca tttact gta 2596 Ala Ala Val Tyr Lys Ala Ser Val Ile Ser Asn Lys Ala Phe ThrVal 845 850 855 860 gga act tca cag act ata aga gtg ctc gtt gag tcg atggat gac acc 2644 Gly Thr Ser Gln Thr Ile Arg Val Leu Val Glu Ser Met AspAsp Thr 865 870 875 gat gct gac tca tta aag agt gca gct gag cat ttg ataagc aca ttg 2692 Asp Ala Asp Ser Leu Lys Ser Ala Ala Glu His Leu Ile SerThr Leu 880 885 890 gaa gat cca gtc gct gtg gta cta gga tca tct cca gaaaaa gac aag 2740 Glu Asp Pro Val Ala Val Val Leu Gly Ser Ser Pro Glu LysAsp Lys 895 900 905 gtt agt tta gtt gct gca ttt agt cct gga gta gtc tcccta ggt gtt 2788 Val Ser Leu Val Ala Ala Phe Ser Pro Gly Val Val Ser LeuGly Val 910 915 920 caa gca ggg aaa ttc att ggc ccc ata gct aag ctg tgtggc gga gga 2836 Gln Ala Gly Lys Phe Ile Gly Pro Ile Ala Lys Leu Cys GlyGly Gly 925 930 935 940 ggt ggt gga aag ccc aat ttt gct cag gca ggc ggcaga aag cct gaa 2884 Gly Gly Gly Lys Pro Asn Phe Ala Gln Ala Gly Gly ArgLys Pro Glu 945 950 955 aat ctc cca agt gcc tta gag aaa gct cgg gaa gatctc gtg gca act 2932 Asn Leu Pro Ser Ala Leu Glu Lys Ala Arg Glu Asp LeuVal Ala Thr 960 965 970 cta ttc gaa aag cta ggg tga agcacaaacttcaaaagtga tctgcgtgta 2983 Leu Phe Glu Lys 
Leu Gly 975 cagagagaaggaagagcaca ttgcttgatt ctagacaagt gtattgcatg tatagatgat 3043 agacattaaagatatttgat gtatctagtt tttgaacatt aaatgatcaa tgacatttct 3103 tttaatgaaaaaaaaaaaaa aaaaaaaaaa aaaaa 3138 8 978 PRT Arabidopsis thaliana 8 MetAsn Phe Ser Arg Val Asn Leu Phe Asp Phe Pro Leu Arg Pro Ile 1 5 10 15Leu Leu Ser His Pro Ser Ser Ile Phe Val Ser Thr Arg Phe Val Thr 20 25 30Arg Thr Ser Ala Gly Val Ser Pro Ser Ile Leu Leu Pro Arg Ser Thr 35 40 45Gln Ser Pro Gln Ile Ile Ala Lys Ser Ser Ser Val Ser Val Gln Pro 50 55 60Val Ser Glu Asp Ala Lys Glu Asp Tyr Gln Ser Lys Asp Val Ser Gly 65 70 7580 Asp Ser Ile Arg Arg Arg Phe Leu Glu Phe Phe Ala Ser Arg Gly His 85 9095 Lys Val Leu Pro Ser Ser Ser Leu Val Pro Glu Asp Pro Thr Val Leu 100105 110 Leu Thr Ile Ala Gly Met Leu Gln Phe Lys Pro Ile Phe Leu Gly Lys115 120 125 Val Pro Arg Glu Val Pro Cys Ala Thr Thr Ala Gln Arg Cys IleArg 130 135 140 Thr Asn Asp Leu Glu Asn Val Gly Lys Thr Ala Arg His HisThr Phe 145 150 155 160 Phe Glu Met Leu Gly Asn Phe Ser Phe Gly Asp TyrPhe Lys Lys Glu 165 170 175 Ala Ile Lys Trp Ala Trp Glu Leu Ser Thr IleGlu Phe Gly Leu Pro 180 185 190 Ala Asn Arg Val Trp Val Ser Ile Tyr GluAsp Asp Asp Glu Ala Phe 195 200 205 Glu Ile Trp Lys Asn Glu Val Gly ValSer Val Glu Arg Ile Lys Arg 210 215 220 Met Gly Glu Ala Asp Asn Phe TrpThr Ser Gly Pro Thr Gly Pro Cys 225 230 235 240 Gly Pro Cys Ser Glu LeuTyr Tyr Asp Phe Tyr Pro Glu Arg Gly Tyr 245 250 255 Asp Glu Asp Val AspLeu Gly Asp Asp Thr Arg Phe Ile Glu Phe Tyr 260 265 270 Asn Leu Val PheMet Gln Tyr Asn Lys Thr Glu Asp Gly Leu Leu Glu 275 280 285 Pro Leu LysGln Lys Asn Ile Asp Thr Gly Leu Gly Leu Glu Arg Ile 290 295 300 Ala GlnIle Leu Gln Lys Val Pro Asn Asn Tyr Glu Thr Asp Leu Ile 305 310 315 320Tyr Pro Ile Ile Ala Lys Ile Ser Glu Leu Ala Asn Ile Ser Tyr Asp 325 330335 Ser Ala Asn Asp Lys Ala Lys Thr Ser Leu Lys Val Ile Ala Asp His 340345 350 Met Arg Ala Val Val Tyr Leu Ile Ser Asp Gly Val Ser Pro Ser Asn355 360 365 Ile Gly Arg Gly Tyr Val Val Arg Arg Leu Ile Arg 
Arg Ala ValArg 370 375 380 Lys Gly Lys Ser Leu Gly Ile Asn Gly Asp Met Asn Gly AsnLeu Lys 385 390 395 400 Gly Ala Phe Leu Pro Ala Val Ala Glu Lys Val IleGlu Leu Ser Thr 405 410 415 Tyr Ile Asp Ser Asp Val Lys Leu Lys Ala SerArg Ile Ile Glu Glu 420 425 430 Ile Arg Gln Glu Glu Leu His Phe Lys LysThr Leu Glu Arg Gly Glu 435 440 445 Lys Leu Leu Asp Gln Lys Leu Asn AspAla Leu Ser Ile Ala Asp Lys 450 455 460 Thr Lys Asp Thr Pro Tyr Leu AspGly Lys Asp Ala Phe Leu Leu Tyr 465 470 475 480 Asp Thr Phe Gly Phe ProVal Glu Ile Thr Ala Glu Val Ala Glu Glu 485 490 495 Arg Gly Val Ser IleAsp Met Asn Gly Phe Glu Val Glu Met Glu Asn 500 505 510 Gln Arg Arg GlnSer Gln Ala Ala His Asn Val Val Lys Leu Thr Val 515 520 525 Glu Asp AspAla Asp Met Thr Lys Asn Ile Ala Asp Thr Glu Phe Leu 530 535 540 Gly TyrAsp Ser Leu Ser Ala Arg Ala Val Val Lys Ser Leu Leu Val 545 550 555 560Asn Gly Lys Pro Val Ile Arg Val Ser Glu Gly Ser Glu Val Glu Val 565 570575 Leu Leu Asp Arg Thr Pro Phe Tyr Ala Glu Ser Gly Gly Gln Ile Ala 580585 590 Asp His Gly Phe Leu Tyr Val Ser Ser Asp Gly Asn Gln Glu Lys Ala595 600 605 Val Val Glu Val Ser Asp Val Gln Lys Ser Leu Lys Ile Phe ValHis 610 615 620 Lys Gly Thr Val Lys Ser Gly Ala Leu Glu Val Gly Lys GluVal Glu 625 630 635 640 Ala Ala Val Asp Ala Asp Leu Arg Gln Arg Ala LysVal His His Thr 645 650 655 Ala Thr His Leu Leu Gln Ser Ala Leu Lys LysVal Val Gly Gln Glu 660 665 670 Thr Ser Gln Ala Gly Ser Leu Val Ala PheAsp Arg Leu Arg Phe Asp 675 680 685 Phe Asn Phe Asn Arg Ser Leu His AspAsn Glu Leu Glu Glu Ile Glu 690 695 700 Cys Leu Ile Asn Arg Trp Ile GlyAsp Ala Thr Arg Leu Glu Thr Lys 705 710 715 720 Val Leu Pro Leu Ala AspAla Lys Arg Ala Gly Ala Ile Ala Met Phe 725 730 735 Gly Glu Lys Tyr AspGlu Asn Glu Val Arg Val Val Glu Val Pro Gly 740 745 750 Val Ser Met GluLeu Cys Gly Gly Thr His Val Gly Asn Thr Ala Glu 755 760 765 Ile Arg AlaPhe Lys Ile Ile Ser Glu Gln Gly Ile Ala Ser Gly Ile 770 775 780 Arg ArgIle Glu Ala Val Ala Gly Glu Ala Phe Ile Glu Tyr Ile Asn 785 790 795 
800Ser Arg Asp Ser Gln Met Thr Arg Leu Cys Ser Thr Leu Lys Val Lys 805 810815 Ala Glu Asp Val Thr Asn Arg Val Glu Asn Leu Leu Glu Glu Leu Arg 820825 830 Ala Ala Arg Lys Glu Ala Ser Asp Leu Arg Ser Lys Ala Ala Val Tyr835 840 845 Lys Ala Ser Val Ile Ser Asn Lys Ala Phe Thr Val Gly Thr SerGln 850 855 860 Thr Ile Arg Val Leu Val Glu Ser Met Asp Asp Thr Asp AlaAsp Ser 865 870 875 880 Leu Lys Ser Ala Ala Glu His Leu Ile Ser Thr LeuGlu Asp Pro Val 885 890 895 Ala Val Val Leu Gly Ser Ser Pro Glu Lys AspLys Val Ser Leu Val 900 905 910 Ala Ala Phe Ser Pro Gly Val Val Ser LeuGly Val Gln Ala Gly Lys 915 920 925 Phe Ile Gly Pro Ile Ala Lys Leu CysGly Gly Gly Gly Gly Gly Lys 930 935 940 Pro Asn Phe Ala Gln Ala Gly GlyArg Lys Pro Glu Asn Leu Pro Ser 945 950 955 960 Ala Leu Glu Lys Ala ArgGlu Asp Leu Val Ala Thr Leu Phe Glu Lys 965 970 975 Leu Gly 9 16 DNAArtificial Sequence Description of Artificial Sequence oligonucleotide 9ngtcgaswga nawgaa 16 10 16 DNA Artificial Sequence Description ofArtificial Sequence oligonucleotide 10 tgwgnagsan casaga 16 11 16 DNAArtificial Sequence Description of Artificial Sequence oligonucleotide11 agwgnagwan cawagg 16 12 16 DNA Artificial Sequence Description ofArtificial Sequence oligonucleotide 12 sttgntastn ctntgc 16 13 15 DNAArtificial Sequence Description of Artificial Sequence oligonucleotide13 ntcgastwts gwgtt 15 14 16 DNA Artificial Sequence Description ofArtificial Sequence oligonucleotide 14 wgtgnagwan canaga 16 15 29 DNAArtificial Sequence Description of Artificial Sequence oligonucleotide15 attaggcacc ccaggcttta cactttatg 29 16 30 DNA Artificial SequenceDescription of Artificial Sequence oligonucleotide 16 gtatgttgtgtggaattgtg agcggataac 30 17 30 DNA Artificial Sequence Description ofArtificial Sequence oligonucleotide 17 taacaatttc acacaggaaa cagctatgac30 18 34 DNA Artificial Sequence Description of Artificial Sequenceoligonucleotide 18 tagcatctga atttcataac caatctcgat acac 34 19 34 DNAArtificial Sequence Description of Artificial 
Sequence oligonucleotide19 gcttcctatt atatcttccc aaattaccaa taca 34 20 34 DNA ArtificialSequence Description of Artificial Sequence oligonucleotide 20gccttttcag aaatggataa atagccttgc ttcc 34 21 1030 DNA Arabidopsisthaliana CDS (74)..(847) 21 tcgacttcct cttcctctga ctttgagcag ctctgtcttcttctcgaaat cgtctcctgt 60 ttcttctgct ttc atg gat gct tca aat ccc aat tcttct aga aaa tct 109 Met Asp Ala Ser Asn Pro Asn Ser Ser Arg Lys Ser 1 510 aat gtc tct tcc ttc gct cag tcc agt cga agc ggt ggt aga gga gga 157Asn Val Ser Ser Phe Ala Gln Ser Ser Arg Ser Gly Gly Arg Gly Gly 15 20 25gga tat gag aga gat aac gat cga cgg aga cct cag ggt cgt ggc gac 205 GlyTyr Glu Arg Asp Asn Asp Arg Arg Arg Pro Gln Gly Arg Gly Asp 30 35 40 ggtgga ggc gga aag gat aga atc gat gca ctt gga cga ctc ttg acg 253 Gly GlyGly Gly Lys Asp Arg Ile Asp Ala Leu Gly Arg Leu Leu Thr 45 50 55 60 agaata ttg cga cat atg gct act gag ctg aga ttg aac atg aga ggt 301 Arg IleLeu Arg His Met Ala Thr Glu Leu Arg Leu Asn Met Arg Gly 65 70 75 gat ggtttt gtt aaa gtt gaa gat tta ctt aac ctg aat ttg aaa act 349 Asp Gly PheVal Lys Val Glu Asp Leu Leu Asn Leu Asn Leu Lys Thr 80 85 90 tct gca aatatt cag tta aag tca cac acg att gat gaa att aga gag 397 Ser Ala Asn IleGln Leu Lys Ser His Thr Ile Asp Glu Ile Arg Glu 95 100 105 gct gtg agaagg gac aat aag caa cgg ttt agt ctc atc gat gag aat 445 Ala Val Arg ArgAsp Asn Lys Gln Arg Phe Ser Leu Ile Asp Glu Asn 110 115 120 gga gag ctcttg att cgc gct aac caa ggc cat tcg atc acg acg gtt 493 Gly Glu Leu LeuIle Arg Ala Asn Gln Gly His Ser Ile Thr Thr Val 125 130 135 140 gag tcagag aag tta ctt aaa cca ata ctg tca cca gaa gaa gct cca 541 Glu Ser GluLys Leu Leu Lys Pro Ile Leu Ser Pro Glu Glu Ala Pro 145 150 155 gtg tgtgta cat gga act tat agg aag aat ttg gaa tcc atc tta gca 589 Val Cys ValHis Gly Thr Tyr Arg Lys Asn Leu Glu Ser Ile Leu Ala 160 165 170 tcg ggctta aag cgt atg aat aga atg cat gtt cac ttc tct tgt gga 637 Ser Gly LeuLys Arg Met Asn Arg Met His Val His Phe Ser Cys Gly 175 180 185 tta ccaaca gat 
ggt gaa gtg att agt ggc atg aga aga aat gta aat 685 Leu Pro ThrAsp Gly Glu Val Ile Ser Gly Met Arg Arg Asn Val Asn 190 195 200 gtt atcatc ttc ctc gac atc aag aaa gct ctt gaa gat ggg att gcg 733 Val Ile IlePhe Leu Asp Ile Lys Lys Ala Leu Glu Asp Gly Ile Ala 205 210 215 220 ttctac ata tca gac aac aaa gtg att ttg act gaa ggc att gat ggt 781 Phe TyrIle Ser Asp Asn Lys Val Ile Leu Thr Glu Gly Ile Asp Gly 225 230 235 gtattg cct gtc gat tac ttc cag aag atc gag tct tgg cct gat cgg 829 Val LeuPro Val Asp Tyr Phe Gln Lys Ile Glu Ser Trp Pro Asp Arg 240 245 250 caatcc ata cct ttc tga ttcatataat tcaacatcat gcgaagattg 877 Gln Ser Ile ProPhe 255 acaggatcct atgacaatga ttgtgaggat tcttctgaac cttgattatgtaatgttgtc 937 tcagtgtttt caattgcaca tatgacaatt tatgaaaact ttcaagattatgttgtttcc 997 tttgcccaaa gaaaaaaaaa aaaaaaaaaa aaa 1030 22 257 PRTArabidopsis thaliana 22 Met Asp Ala Ser Asn Pro Asn Ser Ser Arg Lys SerAsn Val Ser Ser 1 5 10 15 Phe Ala Gln Ser Ser Arg Ser Gly Gly Arg GlyGly Gly Tyr Glu Arg 20 25 30 Asp Asn Asp Arg Arg Arg Pro Gln Gly Arg GlyAsp Gly Gly Gly Gly 35 40 45 Lys Asp Arg Ile Asp Ala Leu Gly Arg Leu LeuThr Arg Ile Leu Arg 50 55 60 His Met Ala Thr Glu Leu Arg Leu Asn Met ArgGly Asp Gly Phe Val 65 70 75 80 Lys Val Glu Asp Leu Leu Asn Leu Asn LeuLys Thr Ser Ala Asn Ile 85 90 95 Gln Leu Lys Ser His Thr Ile Asp Glu IleArg Glu Ala Val Arg Arg 100 105 110 Asp Asn Lys Gln Arg Phe Ser Leu IleAsp Glu Asn Gly Glu Leu Leu 115 120 125 Ile Arg Ala Asn Gln Gly His SerIle Thr Thr Val Glu Ser Glu Lys 130 135 140 Leu Leu Lys Pro Ile Leu SerPro Glu Glu Ala Pro Val Cys Val His 145 150 155 160 Gly Thr Tyr Arg LysAsn Leu Glu Ser Ile Leu Ala Ser Gly Leu Lys 165 170 175 Arg Met Asn ArgMet His Val His Phe Ser Cys Gly Leu Pro Thr Asp 180 185 190 Gly Glu ValIle Ser Gly Met Arg Arg Asn Val Asn Val Ile Ile Phe 195 200 205 Leu AspIle Lys Lys Ala Leu Glu Asp Gly Ile Ala Phe Tyr Ile Ser 210 215 220 AspAsn Lys Val Ile Leu Thr Glu Gly Ile Asp Gly Val Leu Pro Val 225 230 235240 Asp Tyr Phe Gln Lys Ile Glu 
Ser Trp Pro Asp Arg Gln Ser Ile Pro 245250 255 Phe 23 1929 DNA Arabidopsis thaliana CDS (1)..(1929) 23 atg ttcatt ttc cca aaa gac gaa aac aga aga gaa act tta acg aca 48 Met Phe IlePhe Pro Lys Asp Glu Asn Arg Arg Glu Thr Leu Thr Thr 1 5 10 15 aag ctccgt ttc tcc gcc gat cat ctg act ttt acc acc gtg aca gaa 96 Lys Leu ArgPhe Ser Ala Asp His Leu Thr Phe Thr Thr Val Thr Glu 20 25 30 aaa ttg agagca acg gct tgg aga ttt gct ttc tca tcc aga gct aag 144 Lys Leu Arg AlaThr Ala Trp Arg Phe Ala Phe Ser Ser Arg Ala Lys 35 40 45 tcc gtg gta gcaatg gca gct aat gaa gaa ttt acg gga aat ctg aaa 192 Ser Val Val Ala MetAla Ala Asn Glu Glu Phe Thr Gly Asn Leu Lys 50 55 60 cgt caa ctc gcg aagctc ttt gat gtt tct cta aaa tta acg gtt cct 240 Arg Gln Leu Ala Lys LeuPhe Asp Val Ser Leu Lys Leu Thr Val Pro 65 70 75 80 gat gaa cct agt gttgag ccc ttg gtg gct gcc tcc gct ctt gga aaa 288 Asp Glu Pro Ser Val GluPro Leu Val Ala Ala Ser Ala Leu Gly Lys 85 90 95 ttt gga gat tac caa tgtaac aac gca atg gga cta tgg tcc ata att 336 Phe Gly Asp Tyr Gln Cys AsnAsn Ala Met Gly Leu Trp Ser Ile Ile 100 105 110 aaa gga aag ggt act cagttc aag ggt cct cca gct gtt gga cag gcc 384 Lys Gly Lys Gly Thr Gln PheLys Gly Pro Pro Ala Val Gly Gln Ala 115 120 125 ctt gtt aag agt ctc cctact tct gag atg gta gaa tca tgc tct gta 432 Leu Val Lys Ser Leu Pro ThrSer Glu Met Val Glu Ser Cys Ser Val 130 135 140 gct gga cct ggc ttt attaat gtt gta cta tca gct aag tgg atg gct 480 Ala Gly Pro Gly Phe Ile AsnVal Val Leu Ser Ala Lys Trp Met Ala 145 150 155 160 aag agt att gaa aatatg ctc atc gat gga gtt gac aca tgg gca cct 528 Lys Ser Ile Glu Asn MetLeu Ile Asp Gly Val Asp Thr Trp Ala Pro 165 170 175 act ctt tcg gtt aagaga gct gta gtt gat ttt tcc tct ccc aac att 576 Thr Leu Ser Val Lys ArgAla Val Val Asp Phe Ser Ser Pro Asn Ile 180 185 190 gca aaa gaa atg catgtt ggt cat cta aga tca act atc att ggt gac 624 Ala Lys Glu Met His ValGly His Leu Arg Ser Thr Ile Ile Gly Asp 195 200 205 act cta gct cgc atgctc gag tac tca cat gtt gaa gtt cta cgc aga 
672 Thr Leu Ala Arg Met LeuGlu Tyr Ser His Val Glu Val Leu Arg Arg 210 215 220 aac cat gtt ggt gactgg gga aca cag ttt ggc atg cta att gag tac 720 Asn His Val Gly Asp TrpGly Thr Gln Phe Gly Met Leu Ile Glu Tyr 225 230 235 240 ctc ttt gag aaattt cct gat aca gat agt gtg acc gag aca gca att 768 Leu Phe Glu Lys PhePro Asp Thr Asp Ser Val Thr Glu Thr Ala Ile 245 250 255 gga gat ctt caggtg ttt tac aag gca tca aaa cat aaa ttt gat ctg 816 Gly Asp Leu Gln ValPhe Tyr Lys Ala Ser Lys His Lys Phe Asp Leu 260 265 270 gac gag gcc tttaag gaa aaa gca caa cag gct gtg gtc cgt cta cag 864 Asp Glu Ala Phe LysGlu Lys Ala Gln Gln Ala Val Val Arg Leu Gln 275 280 285 ggt ggt gat cctgtt tac cgt aag gct tgg gct aag atc tgt gac atc 912 Gly Gly Asp Pro ValTyr Arg Lys Ala Trp Ala Lys Ile Cys Asp Ile 290 295 300 agc cga act gagttt gcc aag gtt tac caa cgc ctt cga gtt gag ctt 960 Ser Arg Thr Glu PheAla Lys Val Tyr Gln Arg Leu Arg Val Glu Leu 305 310 315 320 gaa gaa aaggga gaa agc ttt tac aac cct cat att gct aaa gta att 1008 Glu Glu Lys GlyGlu Ser Phe Tyr Asn Pro His Ile Ala Lys Val Ile 325 330 335 gag gaa ttgaat agc aag ggg ttg gtt gaa gaa agt gaa ggt gct cgt 1056 Glu Glu Leu AsnSer Lys Gly Leu Val Glu Glu Ser Glu Gly Ala Arg 340 345 350 gtg att ttcctt gaa ggc ttc gac atc cca ctc atg gtt gta aag agt 1104 Val Ile Phe LeuGlu Gly Phe Asp Ile Pro Leu Met Val Val Lys Ser 355 360 365 gat ggt ggtttt aac tat gcc tca aca gat ctg act gct ctt tgg tac 1152 Asp Gly Gly PheAsn Tyr Ala Ser Thr Asp Leu Thr Ala Leu Trp Tyr 370 375 380 cgg ctc aatgaa gag aaa gct gag tgg atc ata tat gtg acc gat gtt 1200 Arg Leu Asn GluGlu Lys Ala Glu Trp Ile Ile Tyr Val Thr Asp Val 385 390 395 400 ggc cagcag cag cac ttt aat atg ttc ttc aaa gct gcc aga aaa gca 1248 Gly Gln GlnGln His Phe Asn Met Phe Phe Lys Ala Ala Arg Lys Ala 405 410 415 ggt tggctt cca gac aat gat aaa act tac cct aga gtt aac cat gtt 1296 Gly Trp LeuPro Asp Asn Asp Lys Thr Tyr Pro Arg Val Asn His Val 420 425 430 ggt tttggt ctc gtc ctt ggg gaa gat ggc aag cga ttt aga 
act cgg 1344 Gly Phe GlyLeu Val Leu Gly Glu Asp Gly Lys Arg Phe Arg Thr Arg 435 440 445 gca acagat gta gtc cgc cta gtt gat ttg cta gat gag gcc aag act 1392 Ala Thr AspVal Val Arg Leu Val Asp Leu Leu Asp Glu Ala Lys Thr 450 455 460 cgc agtaaa ctt gcc ctt att gag cgc ggt aag gac aaa gaa tgg aca 1440 Arg Ser LysLeu Ala Leu Ile Glu Arg Gly Lys Asp Lys Glu Trp Thr 465 470 475 480 ccggaa gaa ctg gac caa aca gct gag gca gtt gga tat ggt gcg gtc 1488 Pro GluGlu Leu Asp Gln Thr Ala Glu Ala Val Gly Tyr Gly Ala Val 485 490 495 aagtat gct gac ctg aag aac aac aga tta aca aat tat act ttc agc 1536 Lys TyrAla Asp Leu Lys Asn Asn Arg Leu Thr Asn Tyr Thr Phe Ser 500 505 510 tttgat caa atg ctt aat gac aag gga aat aca gcc gtt tac ctt ctt 1584 Phe AspGln Met Leu Asn Asp Lys Gly Asn Thr Ala Val Tyr Leu Leu 515 520 525 tacgcc cat gct cgg atc tgt tca atc atc aga aag tct ggc aaa gac 1632 Tyr AlaHis Ala Arg Ile Cys Ser Ile Ile Arg Lys Ser Gly Lys Asp 530 535 540 atagat gag ctg aaa aag aca gga aaa tta gca ttg gat cat gca gat 1680 Ile AspGlu Leu Lys Lys Thr Gly Lys Leu Ala Leu Asp His Ala Asp 545 550 555 560gaa cga gca ctg ggg ctt cac ttg ctt cga ttt gct gag acg gtg gag 1728 GluArg Ala Leu Gly Leu His Leu Leu Arg Phe Ala Glu Thr Val Glu 565 570 575gaa gct tgt acc aac tta tta ccg agt gtt ctg tgc gag tac ctc tac 1776 GluAla Cys Thr Asn Leu Leu Pro Ser Val Leu Cys Glu Tyr Leu Tyr 580 585 590aat tta tct gaa cac ttt acc aga ttc tac tcc aat tgt cag gtc aat 1824 AsnLeu Ser Glu His Phe Thr Arg Phe Tyr Ser Asn Cys Gln Val Asn 595 600 605ggt tca cca gag gag aca agc cgt ctc cta ctt tgt gaa gca acg gcc 1872 GlySer Pro Glu Glu Thr Ser Arg Leu Leu Leu Cys Glu Ala Thr Ala 610 615 620ata gtc atg cgg aaa tgc ttc cac ctt ctt gga atc act ccg gtt tac 1920 IleVal Met Arg Lys Cys Phe His Leu Leu Gly Ile Thr Pro Val Tyr 625 630 635640 aag att tga 1929 Lys Ile 24 642 PRT Arabidopsis thaliana 24 Met PheIle Phe Pro Lys Asp Glu Asn Arg Arg Glu Thr Leu Thr Thr 1 5 10 15 LysLeu Arg Phe Ser Ala Asp His Leu Thr Phe Thr Thr 
Val Thr Glu 20 25 30 LysLeu Arg Ala Thr Ala Trp Arg Phe Ala Phe Ser Ser Arg Ala Lys 35 40 45 SerVal Val Ala Met Ala Ala Asn Glu Glu Phe Thr Gly Asn Leu Lys 50 55 60 ArgGln Leu Ala Lys Leu Phe Asp Val Ser Leu Lys Leu Thr Val Pro 65 70 75 80Asp Glu Pro Ser Val Glu Pro Leu Val Ala Ala Ser Ala Leu Gly Lys 85 90 95Phe Gly Asp Tyr Gln Cys Asn Asn Ala Met Gly Leu Trp Ser Ile Ile 100 105110 Lys Gly Lys Gly Thr Gln Phe Lys Gly Pro Pro Ala Val Gly Gln Ala 115120 125 Leu Val Lys Ser Leu Pro Thr Ser Glu Met Val Glu Ser Cys Ser Val130 135 140 Ala Gly Pro Gly Phe Ile Asn Val Val Leu Ser Ala Lys Trp MetAla 145 150 155 160 Lys Ser Ile Glu Asn Met Leu Ile Asp Gly Val Asp ThrTrp Ala Pro 165 170 175 Thr Leu Ser Val Lys Arg Ala Val Val Asp Phe SerSer Pro Asn Ile 180 185 190 Ala Lys Glu Met His Val Gly His Leu Arg SerThr Ile Ile Gly Asp 195 200 205 Thr Leu Ala Arg Met Leu Glu Tyr Ser HisVal Glu Val Leu Arg Arg 210 215 220 Asn His Val Gly Asp Trp Gly Thr GlnPhe Gly Met Leu Ile Glu Tyr 225 230 235 240 Leu Phe Glu Lys Phe Pro AspThr Asp Ser Val Thr Glu Thr Ala Ile 245 250 255 Gly Asp Leu Gln Val PheTyr Lys Ala Ser Lys His Lys Phe Asp Leu 260 265 270 Asp Glu Ala Phe LysGlu Lys Ala Gln Gln Ala Val Val Arg Leu Gln 275 280 285 Gly Gly Asp ProVal Tyr Arg Lys Ala Trp Ala Lys Ile Cys Asp Ile 290 295 300 Ser Arg ThrGlu Phe Ala Lys Val Tyr Gln Arg Leu Arg Val Glu Leu 305 310 315 320 GluGlu Lys Gly Glu Ser Phe Tyr Asn Pro His Ile Ala Lys Val Ile 325 330 335Glu Glu Leu Asn Ser Lys Gly Leu Val Glu Glu Ser Glu Gly Ala Arg 340 345350 Val Ile Phe Leu Glu Gly Phe Asp Ile Pro Leu Met Val Val Lys Ser 355360 365 Asp Gly Gly Phe Asn Tyr Ala Ser Thr Asp Leu Thr Ala Leu Trp Tyr370 375 380 Arg Leu Asn Glu Glu Lys Ala Glu Trp Ile Ile Tyr Val Thr AspVal 385 390 395 400 Gly Gln Gln Gln His Phe Asn Met Phe Phe Lys Ala AlaArg Lys Ala 405 410 415 Gly Trp Leu Pro Asp Asn Asp Lys Thr Tyr Pro ArgVal Asn His Val 420 425 430 Gly Phe Gly Leu Val Leu Gly Glu Asp Gly LysArg Phe Arg Thr Arg 435 440 445 Ala Thr Asp Val Val Arg 
Leu Val Asp LeuLeu Asp Glu Ala Lys Thr 450 455 460 Arg Ser Lys Leu Ala Leu Ile Glu ArgGly Lys Asp Lys Glu Trp Thr 465 470 475 480 Pro Glu Glu Leu Asp Gln ThrAla Glu Ala Val Gly Tyr Gly Ala Val 485 490 495 Lys Tyr Ala Asp Leu LysAsn Asn Arg Leu Thr Asn Tyr Thr Phe Ser 500 505 510 Phe Asp Gln Met LeuAsn Asp Lys Gly Asn Thr Ala Val Tyr Leu Leu 515 520 525 Tyr Ala His AlaArg Ile Cys Ser Ile Ile Arg Lys Ser Gly Lys Asp 530 535 540 Ile Asp GluLeu Lys Lys Thr Gly Lys Leu Ala Leu Asp His Ala Asp 545 550 555 560 GluArg Ala Leu Gly Leu His Leu Leu Arg Phe Ala Glu Thr Val Glu 565 570 575Glu Ala Cys Thr Asn Leu Leu Pro Ser Val Leu Cys Glu Tyr Leu Tyr 580 585590 Asn Leu Ser Glu His Phe Thr Arg Phe Tyr Ser Asn Cys Gln Val Asn 595600 605 Gly Ser Pro Glu Glu Thr Ser Arg Leu Leu Leu Cys Glu Ala Thr Ala610 615 620 Ile Val Met Arg Lys Cys Phe His Leu Leu Gly Ile Thr Pro ValTyr 625 630 635 640 Lys Ile 25 20 DNA Artificial Sequence Description ofArtificial Sequence oligonucleotide 25 gcggacatct acatttttga 20 26 31DNA Artificial Sequence Description of Artificial Sequenceoligonucleotide 26 acttcactgc cttcagaaac ccttatcaca g 31 27 31 DNAArtificial Sequence Description of Artificial Sequence oligonucleotide27 cttatcacag gcttcccatt caccaaaaga c 31

What is claimed is:
 1. A method of identifying herbicidal compounds, comprising: a) combining a polypeptide comprising an amino acid sequence at least 85% identical to SEQ ID NO:2, 4, or 6 and a compound to be tested for the ability to bind to said polypeptide, under conditions conducive to binding; b) selecting a compound identified in (a) that binds to said polypeptide; c) applying a compound selected in (b) to a plant to test for herbicidal activity; and d) selecting a compound identified in (c) that has herbicidal activity.
 2. The method of claim 1, wherein said polypeptide comprises an amino acid sequence at least 95% identical to SEQ ID NO:2, 4, or 6.
 3. The method of claim 1, wherein said polypeptide comprises an amino acid sequence at least 99% identical to SEQ ID NO:2, 4, or 6.
 4. The method of claim 1, wherein said polypeptide comprises SEQ ID NO:2, 4, or 6.
 5. A method of identifying herbicidal compounds, comprising: a) combining a polypeptide comprising an amino acid sequence at least 85% identical to SEQ ID NO:2, 4, or 6 and a compound to be tested for the ability to inhibit said polypeptide, under conditions conducive to inhibition; b) selecting a compound identified in (a) that inhibits said polypeptide; c) applying a compound selected in (b) to a plant to test for herbicidal activity; and d) selecting a compound identified in (c) that has herbicidal activity.
 6. The method of claim 1, wherein said polypeptide comprises an amino acid sequence at least 95% identical to SEQ ID NO:2, 4, or 6.
 7. The method of claim 1, wherein said polypeptide comprises an amino acid sequence at least 99% identical to SEQ ID NO:2, 4, or 6.
 8. The method of claim 1, wherein said polypeptide comprises SEQ ID NO:2, 4, or 6.
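
By way of illustration only, the percent-identity thresholds recited in the claims (85%, 95%, and 99%) can be checked computationally once a candidate polypeptide has been aligned to a reference sequence. The following is a minimal sketch and not a method set out in this document. It assumes the two sequences are already aligned to equal length with "-" marking gaps, and it assumes identity is counted as identical residues divided by alignment columns (excluding columns where both sequences have a gap); other conventions for computing percent identity exist and may give different values. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
from typing import List, Sequence


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: matches / aligned columns, skipping columns
    in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" and res_b == "-":
            continue  # column carries no information
        compared += 1
        if res_a == res_b:
            matches += 1  # a gap never equals a residue here
    if compared == 0:
        raise ValueError("no comparable columns in alignment")
    return 100.0 * matches / compared


def thresholds_met(identity: float,
                   thresholds: Sequence[float] = (85.0, 95.0, 99.0)) -> List[float]:
    """Return the thresholds that the given percent identity satisfies."""
    return [t for t in thresholds if identity >= t]


if __name__ == "__main__":
    # Hypothetical toy sequences, 20 residues, two substitutions.
    reference = "MKTAYIAKQRQISFVKSHFS"
    candidate = "MKSAYIAKQKQISFVKSHFS"
    identity = percent_identity(reference, candidate)
    print(identity, thresholds_met(identity))
```

In practice the alignment itself would be produced by a dedicated alignment program, and the result depends on the scoring parameters chosen for that program.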